Cyrus 2.4 replication bug, plus a couple of questions
David Carter
dpc22 at cam.ac.uk
Mon Nov 29 02:48:41 EST 2010
On Mon, 29 Nov 2010, Bron Gondwana wrote:
> "renumber_one_record" - which will cause the master end to renumber
> that UID higher than any which currently exist at EITHER end. This
> means that it will be a pure append, which will work correctly.
>
> But you're right, we should avoid the recno++ so that at least the
> rest of the updates will go through correctly the first time and won't
> need to be re-synced again in the full update!
That's rather nice.
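The renumbering rule quoted above can be sketched roughly as follows. This is a toy Python model, not the Cyrus C code: the (uid, guid) tuples and the function name resolve_conflicts are made up for illustration. Only the rule itself comes from the discussion: a conflicting record gets a UID higher than any at either end, and every remaining record is still visited.

```python
def resolve_conflicts(master, replica):
    """Toy model of UID renumbering on the master.

    master, replica: lists of (uid, guid) tuples sorted by uid.
    (Hypothetical layout, not the real Cyrus index record format.)
    Returns the master's records after conflicting UIDs are renumbered.
    """
    replica_by_uid = dict(replica)

    # The new UID must be higher than any UID that exists at EITHER end.
    all_uids = [uid for uid, _ in master] + [uid for uid, _ in replica]
    next_uid = max(all_uids, default=0) + 1

    kept, appended = [], []
    # Visit every master record exactly once; no extra increment that
    # would skip the record following a conflict.
    for uid, guid in master:
        other = replica_by_uid.get(uid)
        if other is not None and other != guid:
            # Same UID, different message: renumber so it is a pure append.
            appended.append((next_uid, guid))
            next_uid += 1
        else:
            kept.append((uid, guid))
    return kept + appended
```

With master UIDs 1-3 and replica UIDs 1-4, conflicts at UIDs 2 and 3 come back as UIDs 5 and 6 on the master, so both are handled in a single pass.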
Just to be clear, sync_client actually gives up the ghost:
Nov 28 14:20:13 cyrus-38 sync_client[30920]: inefficient replication (47 > 1) user.dpc99
Nov 28 14:20:46 cyrus-38 sync_client[30920]: MAILBOX received NO response: IMAP_MAILBOX_CRC Checksum Failure
Nov 28 14:20:46 cyrus-38 sync_client[30920]: do_folders(): update failed: user.dpc99 'Mailbox format corruption detected'
Nov 28 14:20:46 cyrus-38 sync_client[30920]: IOERROR: Mailbox format corruption detected
Nov 28 14:20:46 cyrus-38 sync_client[30920]: Error in do_user(dpc99): bailing out!
but works if I restart it.
I suspect that the unwanted recno++ explains the problems that
sgcarr at civeng.adelaide.edu.au reported on info-cyrus this morning, given
that he had only just tried to upgrade from 2.3.16 to 2.4.4. He said:
> So I would run reconstruct -s user.USER1 and then restart the
> sync_client but the error would occur on another mailbox of the same
> user.
I suspect that the second pass would have worked without any reconstructs.
>> Quick Question about cyr_expire in 2.4
>> ======================================
>>
>> The replication engine in 2.4 tries to track expunged messages. Does
>> this mean that it is no longer safe to run cyr_expire on replica
>> servers?
>
> ... ish. cyr_expire has a lot of different functionality merged into
> it unfortunately. I assume you're talking about using it to delete
> messages over a certain age based on annotations here, rather than
> cleaning up expunged messages.
>
> Currently if you expunge on the replica end, then a full mailbox sync
> will notice the fact and should expunge it on the master as well.
> This might not be working, which would be a bug[tm].
Expiring/unlinking expunged messages is what I had in mind. I'll have
a play and see what works.
>> [Replication over SSH links].
>
> Definitely. I've been complaining about the need for SASL layers for
> a while. It makes setting up and debugging sync a lot more of a pain
> than it should be! At least it should be possible to add the same
> password to the config file on the replica rather than needing to
> add a password to /etc/sasldb2 or friends!
I'll see what I can knock up.
One other tiny change that I made was to add a dlist_print_debug() which
prints a subset of each entry to stdout so that I can watch what is going
on without being swamped by message content. This is a problem that I
always had with the full telemetry logging stuff.
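As an illustration of the kind of trimmed output meant here, a minimal Python sketch follows. It is not the actual dlist_print_debug() patch: the (name, value) tuple layout, the summarize() helper, the field names and the 40-character limit are all assumptions chosen for the example.

```python
def summarize(entry, max_len=40, depth=0):
    """Return indented summary lines for a nested (name, value) entry.

    A value is either a list of child entries or a scalar. Long scalars
    (e.g. message content) are truncated so they do not swamp the output.
    This layout is hypothetical, not the real Cyrus dlist structure.
    """
    name, value = entry
    indent = "  " * depth
    if isinstance(value, list):
        lines = [f"{indent}{name}:"]
        for child in value:
            lines.extend(summarize(child, max_len, depth + 1))
        return lines
    text = str(value)
    if len(text) > max_len:
        # Keep only a prefix and note the original length.
        text = f"{text[:max_len]}... ({len(text)} chars)"
    return [f"{indent}{name}: {text}"]


if __name__ == "__main__":
    example = ("MAILBOX", [
        ("UNIQUEID", "abc"),
        ("RECORD", [("UID", 5), ("BODY", "x" * 100)]),
    ])
    print("\n".join(summarize(example)))
```

Truncating by length rather than by field name keeps the sketch independent of which entries happen to carry message content.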
--
David Carter Email: David.Carter at ucs.cam.ac.uk
University Computing Service, Phone: (01223) 334502
New Museums Site, Pembroke Street, Fax: (01223) 334679
Cambridge UK. CB2 3QH.