Cyrus 2.4 replication bug, plus a couple of questions
Bron Gondwana
brong at fastmail.fm
Sun Nov 28 18:29:51 EST 2010
On Sun, Nov 28, 2010 at 06:04:20PM +0000, David Carter wrote:
> I've set up a little Cyrus-2.4 testrig and moved my own email to it
> to see what breaks. So far so good, apart from a little bug in the
> 2.4 replication code, as follows:
>
> Replication bug
> ===============
>
> sync_server.c: do_mailbox() gets thoroughly confused if a mailbox on the
> master has index_records which do not appear in the replica mailbox.
Ok...
> An unwanted recno++ means that the two ends get out of sync and
> sync_server starts to compare the wrong messages.
Oh yeah - that basically means that unless they fall back into sync,
they're going to out "forever".
> Trivial patch attached. I can report this in bugzilla as well if
> that is the desired procedure, but this looks like the obvious fix.
If it's going to go in a 'stable' release then it needs to go into
Bugzilla. If you create your own git repository somewhere (github
is pretty good) then you can create patches and just ask one of us
to pull them directly into the tree rather than applying diffs.
I can do the bugzilla entry though. This looks trivially correct,
and definitely worth sneaking into 2.4.5.
> I imagine that the common case where this would happen is where
> someone had been running replication in 2.3 with different sets of
> expunged messages at the two ends. Either because only one end had
> been running with delayed expunge enabled, or different cyr_expire
> configurations. Or in my case, a freshly generated 2.3 replica
> without any expunged messages.
While it kind of sucks from a "has to re-run the whole lot"
perspective, this bug doesn't hurt too badly, because...
> Curiously while the first attempt at "sync_client -u" bailed out
> with a IMAP_MAILBOX_CRC, a second attempt worked after a bit of ping
> pong with highestmodseq and the modseqs on individual message. This
> lead to an partial upload list without any of the expunged messages
> from the master.
Correct - that's what was supposed to happen!
http://www.mail-archive.com/cyrus-devel@lists.andrew.cmu.edu/msg01355.html
Quick summary: if the message exists on the master but not on the
replica, then it's possible that the replica was a master at some
point, and told a client that this UID didn't exist any more. It
is illegal for messages to re-appear after they've been expunged,
so the only sane approaches are bump the UIDVALIDITY (shit, causes
a full resync) or re-number that message.
This is what actually happens. There is a CRC failure response,
which causes sync_client to request a full mailbox dump from the
sync_server. This is in sync_client: mailbox_full_update. This
handles all the cases (note my visible tabs in the copy-pasta ;)
>------- /* same UID - compare the records */
>------- if (rrecord.uid == mrecord.uid) {
>------->-------/* hasn't been changed already, check it */
>------->-------if (mrecord.modseq <= highestmodseq) {
>------->------- r = compare_one_record(mailbox,
>------->------->------->------->------- &mrecord, &rrecord,
>------->------->------->------->------- kaction);
>------->------- if (r) goto done;
>------->-------}
>------->-------/* increment both */
>------->-------recno++;
>------->-------ki = ki->next;
>------- }
>------- else if (rrecord.uid > mrecord.uid) {
>------->-------/* record only exists on the master */
>------->-------r = renumber_one_record(&mrecord, kaction);
>------->-------if (r) goto done;
>------->-------/* only increment master */
>------->-------recno++;
>------- }
>------- else {
>------->-------/* record only exists on the replica */
>------->-------r = copyback_one_record(mailbox, &rrecord, kaction);
>------->-------if (r) goto done;
>------->-------/* only increment replica */
>------->-------ki = ki->next;
>------- }
"renumber_one_record" - which will causes the master end to renumber
that UID higher than any which currently exist at EITHER end. This
means that it will be a pure append, which will work correctly.
But you're right, we should avoid the recno++ so that at least the
rest of the updates will go through correctly the first time and won't
need to be re-synced again in the full update!
> Quick Question about cyr_expire in 2.4
> ======================================
>
> The replication engine in 2.4 tries to track expunged messages. Does
> this mean that it is no longer safe to run cyr_expire on replica
> servers?
... ish. cyr_expire has a lot of different functionality merged into
it unfortunately. I assume you're talking about using it to delete
messages over a certain age based on annotations here, rather than
cleaning up expunged messages.
Currently if you expunge on the replica end, then a full mailbox sync
will notice the fact and should expunge it on the master as well.
This might not be working, which would be a bug[tm].
> I'm a bit concerned that this would stop us from running
> master/master, with a subset of accounts running on each server in a
> pair.
That should be fine, so long as you have a sync_client running at each
end pointing to the other end, because the mailbox change will cause
a triggered sync with THAT end as the master, and the expunge logic
works fine that way :)
cyr_expire to remove old expunged messages from the cyrus.index while
unlinking the spool file - that's asynchronous. Either end can do it
whenever they want. The only time it affects replication is if the
"DELETEDMODSEQ" field (used for both QRESYNC and replication in exactly
the same way!) is higher than the modseq on the replica. In that case
there may be some missing expunge data, so a full sync is required. It
will syslog that "inefficient replication" is being used in this case.
So either if the replication has been offline for over a week (in our
case) or if immediate delete is enabled...
NOW - there's a separate bug that's causing CRC failues every time
a mailbox gets repacked - I've got a fix in git that will go to 2.4.5.
I introduced the bug in 2.4.4 while rewriting the upgrade code :(
http://bugzilla.cyrusimap.org/show_bug.cgi?id=3347
> cyrus-cvs list
> ==============
>
> cyrus-cvs at lists.andrew.cmu.edu dried up a few months back. Is there
> an equivalent list for commits into
> git://git.cyrusimap.org/cyrus-imapd/?
> This might be a naive question: I only have limited experience of git.
>
> http://www.cyrusimap.org/mediawiki/index.php/Cyrus_Mailing_Lists mentions
> cyrus-cvs, but no replacement list.
That would be Jeroen's fault ;) Hit him. It should be pretty easy to
add a commit hook to generate emails in git.
> A thank you to Bron
> ===================
>
> The mailbox abstraction code in 2.4 is much nicer to work with than
> the low level hackery which used to go on in earlier versions of
> Cyrus.
Oh isn't it just! I'm really quite happy with the state of mailbox.c
and to a lesser extent index.c now. mboxlist.c is still horribly
nasty, and is next on my list to clean up. There should be a struct
containing all the necessary facts about a "mailbox name" such that
there is no such thing as a bare internal name, and the conversations
in and out internal format only happen once at the edges. Far too
many little subtle bugs in the handling of separators and virtual
domains and converstion from mailbox name to owner userid and...
well, I've already fixed a couple of them, but forcing it to not be
the same datatype (char *) for both would go a long way towards
ensuring these bugs don't remain.
> I managed to port by own (2003 era) replication code and other local
> changes in about 3 days. Plus one day to work out what was going on
> with the new FLAG_SEEN stuff in 2.4 and a few hours today working
> out why the normal 2.4 replication code was having a fit (see
> above).
Oh yeah - FLAG_SEEN. That's a bit complex, but I think worthwhile
for the IO saving in the common case. I'm seriously thinking of
pulling it out of index.c and append.c and merging it down into
mailbox.c actually. Hiding it behind an API and adding a boolean
"isseen" to the "struct index_record" such that users don't care
(once they've opened the mailbox as a specific user) about the
underlying implemenation. Make mailbox_rewrite_index_record
and mailbox_append_index_record set the right dirty flag in the
'struct mailbox' and update the right datastructures so that it
just works!
I'm not totally happy with the number of different places that is
hand-managed at the moment!
> mailbox_rewrite_index_record() and mailbox_append_index_record()
> probably generate a little more disk I/O than what was there before
> as they can only work with a single message at a time. But if
> Fastmail don't care, then I would be very surprised if I notice.
Surprisingly little extra I suspect. The only case where it would
happen is where a record splits over a block boundary. Usually at
least the flags would have been rewritten, and the modseq always
has to as well now (plus the CRC32!)
The saving in terms of not repacking the cyrus.index and cyrus.cache
on every expunge (even if not unlinking the spool) more than makes
up for it!
> I'm a little concerned that the new replication engine doesn't
> appear to be able to cope with messages reappearing in the middle of
> a mailbox, but That Should Never Happen.
See above. It copes. It copes by falling back to full sync, and
promoting the UID of the reappearing message. There is no other
legal way to do it!
> The simplicity of 2.4 has a
> lot going for it (says the guy who wrote sync_combine_commit() plus
> sync_append_commit(), and probably quite a lot of the other really
> nasty convolved code in 2.3).
Simple is good :) I wanted a model that could have sane locking
and where changes to the data were shepherded through a very clear
and very simple API, so there weren't too many bits to keep in sync!
> The replication code in 2.4 is much nicer and looks like something
> that I can use in production, which was never the case with the code
> in 2.3. The giant central lock was always a show stopper for me. The
> only thing that I have interfered with is to run the replication
> protocol over SSH links so that I can use SSH key based
> authentication rather than adding sync_host, sync_authname and
> sync_password (ugh!) into configuration files. That's a relatively
> modest bit of replumbing of the various prot channels: would there
> be any interesting in turning this into a more general patch?
Definitely. I've been complaining about the need for SASL layers for
a while. It makes setting up and debugging sync a lot more of a pain
than it should be! At least it should be possible to add the same
password to the config file on the replica rather than needing to
add a password to /etc/sasldb2 or friends!
Bron.
More information about the Cyrus-devel
mailing list