*IMPORTANT* - bugfix sync_append_commit index breakage
Ken Murchison
murch at andrew.cmu.edu
Sat Sep 1 12:21:30 EDT 2007
I applied this patch to CVS, but I'm wondering why the original code
writes directly to the existing cyrus.index instead of a
cyrus.index.NEW, as is done in other places. I don't know if this was
intentional by the previous authors, or an oversight. I'm going to
investigate.
Bron Gondwana wrote:
> We've had issues with broken indexes on the old master after a failover,
> and puzzled for ages over the cause. This is the reason behind the
> SIGQUIT patch which would mitigate this by reducing the "unfinished
> business" on a standard shutdown.
>
> But still, the symptoms were weird.
>
> I'm pretty sure I've figured out the cause, and the attached patch
> (compile tested only) fixes it... unfortunately it's a race condition
> so it's hard to test.
>
> Ken - it would be great if you can check this today and push it to CVS
> if you agree with me. It's a nasty one - especially since the symptoms
> are offset index records right until something that's not a sync_server
> append comes along and writes to the correct place in the index file
> again.
>
> On a more structural note - it would be great to re-factor the lower
> level APIs (mailbox.c, mboxlist.c, etc) to provide everything that
> sync_server needs manipulate their data structures and remove all the
> low level fiddling copy-and-pasted stuff from the sync layer. This
> comes back a bit to David's comments today about the difficulty of
> having patches flying around - I suspect he copied and pasted so he
> wasn't throwing tentacles deep through the main codebase - but the
> end result is more brittle software and poor abstraction. We need
> to avoid doing that if there's a bunch of us all hacking in our own
> little patch space!
>
> Bron.
>
>
> ------------------------------------------------------------------------
>
> Bugfix sync index append
>
> After a failover, we sometimes get index files
> with duplicate records in them and other nastiness.
>
> The cyrus index file format has this clever "ignore"
> junk at the end until the exists count changes trick.
>
> sync_commit.c contains a copy and pasted "index append"
> logic, but it seeks to the end of the file rather than
> start_offset + exists * record_size. This is bogus and
> means that at some random future stage a real record will
> be overwritten and two of the old bogus record will still
> exist. Crap.
>
> This patch fixes that, obviously.
> Index: cyrus-imapd-2.3.9/imap/sync_commit.c
> ===================================================================
> --- cyrus-imapd-2.3.9.orig/imap/sync_commit.c 2007-08-31 10:13:28.000000000 -0400
> +++ cyrus-imapd-2.3.9/imap/sync_commit.c 2007-08-31 10:26:03.000000000 -0400
> @@ -481,6 +481,7 @@
> unsigned long newflagged;
> char target[MAX_MAILBOX_PATH];
> int n, r = 0;
> + long last_offset;
> struct txn *tid = NULL;
> modseq_t highestmodseq = 0;
>
> @@ -577,9 +578,11 @@
> if (retry_writev(mailbox->cache_fd, cache_iovec, upload_list->count) < 0)
> goto fail;
>
> - lseek(mailbox->index_fd, mailbox->index_size, SEEK_SET);
> +
> + last_offset = mailbox->start_offset + mailbox->exists * mailbox->record_size;
> + lseek(mailbox->index_fd, last_offset, SEEK_SET);
> if (retry_write(mailbox->index_fd, index_chunk,
> - upload_list->count * INDEX_RECORD_SIZE) < 0)
> + upload_list->count * mailbox->record_size) < 0)
> goto fail;
>
>
--
Kenneth Murchison
Systems Programmer
Project Cyrus Developer/Maintainer
Carnegie Mellon University
More information about the Cyrus-devel
mailing list