Cyrus 2.4 replication bug, plus a couple of questions

David Carter dpc22 at cam.ac.uk
Mon Nov 29 02:48:41 EST 2010


On Mon, 29 Nov 2010, Bron Gondwana wrote:

> "renumber_one_record" - which will cause the master end to renumber
> that UID higher than any which currently exist at EITHER end.  This
> means that it will be a pure append, which will work correctly.
>
> But you're right, we should avoid the recno++ so that at least the
> rest of the updates will go through correctly the first time and won't
> need to be re-synced again in the full update!

That's rather nice.
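
As a rough illustration of the renumbering idea quoted above, here is a toy
model in Python rather than the actual Cyrus C code. The function name
renumber_uid and the plain lists of UIDs are hypothetical; the real
renumber_one_record works on index records, and this sketch does not model
the recno++ handling at all. It only shows why picking a UID above the
maximum at both ends turns the fix into a pure append.

```python
def renumber_uid(master_uids, replica_uids, uid):
    """Toy model: give `uid` on the master a fresh UID above both ends.

    Returns the new list of master UIDs. Hypothetical helper, not Cyrus code.
    """
    if uid not in master_uids:
        raise KeyError(uid)
    # The new UID must exceed every UID known to EITHER end, so the
    # replica sees it as a plain append rather than a conflicting record.
    new_uid = max(master_uids + replica_uids) + 1
    remaining = sorted(u for u in master_uids if u != uid)
    return remaining + [new_uid]
```

For example, with master UIDs [1, 2, 3] and replica UIDs [1, 2, 3, 4], the
conflicting UID 3 would move to 5 on the master, above the replica's 4.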

Just to be clear, sync_client actually gives up the ghost:

Nov 28 14:20:13 cyrus-38 sync_client[30920]:
   inefficient replication (47 > 1) user.dpc99
Nov 28 14:20:46 cyrus-38 sync_client[30920]:
   MAILBOX received NO response: IMAP_MAILBOX_CRC Checksum Failure
Nov 28 14:20:46 cyrus-38 sync_client[30920]:
   do_folders(): update failed: user.dpc99 'Mailbox format corruption detected'
Nov 28 14:20:46 cyrus-38 sync_client[30920]:
   IOERROR: Mailbox format corruption detected
Nov 28 14:20:46 cyrus-38 sync_client[30920]:
   Error in do_user(dpc99): bailing out!

but works if I restart it.

I suspect that the unwanted recno++ explains the problems that
sgcarr at civeng.adelaide.edu.au reported on info-cyrus this morning, given
that they had only just tried to upgrade from 2.3.16 to 2.4.4. He said:

> So I would run reconstruct -s user.USER1 and then restart the
> sync_client but the error would occur on another mailbox of the same
> user.

I suspect that the second pass would have worked without any reconstructs.

>> Quick Question about cyr_expire in 2.4
>> ======================================
>>
>> The replication engine in 2.4 tries to track expunged messages. Does
>> this mean that it is no longer safe to run cyr_expire on replica
>> servers?
>
> ... ish.  cyr_expire has a lot of different functionality merged into
> it unfortunately.  I assume you're talking about using it to delete
> messages over a certain age based on annotations here, rather than
> cleaning up expunged messages.
>
> Currently if you expunge on the replica end, then a full mailbox sync
> will notice the fact and should expunge it on the master as well.
> This might not be working, which would be a bug[tm].

Expiring/unlinking expunged messages is what I had in mind. I'll have
a play and see what works.

>> [Replication over SSH links].
>
> Definitely.  I've been complaining about the need for SASL layers for
> a while.  It makes setting up and debugging sync a lot more of a pain
> than it should be!  At least it should be possible to add the same
> password to the config file on the replica rather than needing to
> add a password to /etc/sasldb2 or friends!

I'll see what I can knock up.

One other tiny change that I made was to add a dlist_print_debug() which 
prints a subset of each entry to stdout so that I can watch what is going 
on without being swamped by message content. This is a problem that I 
always had with the full telemetry logging stuff.
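
The idea behind such a debug printer can be sketched as follows. This is a
Python stand-in, not the dlist_print_debug() code itself: the nested
dict/list/bytes representation, the names summarize and print_debug, and
the 40-byte cutoff are all assumptions made for illustration. The point is
only that large payloads are replaced by a size marker while the
surrounding structure stays readable.

```python
def summarize(item, max_len=40):
    """Render a nested entry on one line, eliding large byte payloads."""
    if isinstance(item, dict):
        # Key/value list: print each key followed by its summarized value.
        parts = (f"{key} {summarize(value, max_len)}" for key, value in item.items())
        return "(" + " ".join(parts) + ")"
    if isinstance(item, (list, tuple)):
        return "(" + " ".join(summarize(value, max_len) for value in item) + ")"
    if isinstance(item, (bytes, bytearray)):
        # Message content and other literals: show the size, not the data.
        if len(item) > max_len:
            return f"<{len(item)} bytes elided>"
        return bytes(item).decode("ascii", "replace")
    return str(item)


def print_debug(name, item, max_len=40):
    """Print one summarized entry to stdout and return the printed line."""
    line = f"{name} {summarize(item, max_len)}"
    print(line)
    return line
```

Called on an entry holding a UID, a flag list and a kilobyte of message
body, this prints the UID and flags in full and only the byte count of the
body.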

-- 
David Carter                             Email: David.Carter at ucs.cam.ac.uk
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.


More information about the Cyrus-devel mailing list