*IMPORTANT* - bugfix sync_append_commit index breakage

Ken Murchison murch at andrew.cmu.edu
Sat Sep 1 12:21:30 EDT 2007


I applied this patch to CVS, but I'm wondering why the original code 
writes directly to the existing cyrus.index instead of a 
cyrus.index.NEW, as is done in other places.  I don't know if this was 
intentional by the previous authors, or an oversight.  I'm going to 
investigate.


Bron Gondwana wrote:
> We've had issues with broken indexes on the old master after a failover,
> and puzzled for ages over the cause.  This is the reason behind the
> SIGQUIT patch which would mitigate this by reducing the "unfinished
> business" on a standard shutdown.
> 
> But still, the symptoms were weird.
> 
> I'm pretty sure I've figured out the cause, and the attached patch
> (compile tested only) fixes it...  unfortunately it's a race condition
> so it's hard to test.
> 
> Ken - it would be great if you can check this today and push it to CVS
> if you agree with me.  It's a nasty one - especially since the symptoms
> are offset index records right until something that's not a sync_server
> append comes along and writes to the correct place in the index file
> again.
> 
> On a more structural note - it would be great to re-factor the lower
> level APIs (mailbox.c, mboxlist.c, etc) to provide everything that
> sync_server needs manipulate their data structures and remove all the
> low level fiddling copy-and-pasted stuff from the sync layer.  This
> comes back a bit to David's comments today about the difficulty of
> having patches flying around - I suspect he copied and pasted so he
> wasn't throwing tentacles deep through the main codebase - but the
> end result is more brittle software and poor abstraction.  We need
> to avoid doing that if there's a bunch of us all hacking in our own
> little patch space!
> 
> Bron.
> 
> 
> ------------------------------------------------------------------------
> 
> Bugfix sync index append
> 
> After a failover, we sometimes get index files
> with duplicate records in them and other nastiness.
> 
> The cyrus index file format has this clever "ignore"
> junk at the end until the exists count changes trick.
> 
> sync_commit.c contains a copy and pasted "index append" 
> logic, but it seeks to the end of the file rather than
> start_offset + exists * record_size.  This is bogus and
> means that at some random future stage a real record will
> be overwritten and two of the old bogus record will still
> exist.  Crap.
> 
> This patch fixes that, obviously.
> Index: cyrus-imapd-2.3.9/imap/sync_commit.c
> ===================================================================
> --- cyrus-imapd-2.3.9.orig/imap/sync_commit.c	2007-08-31 10:13:28.000000000 -0400
> +++ cyrus-imapd-2.3.9/imap/sync_commit.c	2007-08-31 10:26:03.000000000 -0400
> @@ -481,6 +481,7 @@
>      unsigned long newflagged;
>      char  target[MAX_MAILBOX_PATH];
>      int   n, r = 0;
> +    long   last_offset;
>      struct txn *tid = NULL;
>      modseq_t highestmodseq = 0;
>  
> @@ -577,9 +578,11 @@
>      if (retry_writev(mailbox->cache_fd, cache_iovec, upload_list->count) < 0)
>          goto fail;
>  
> -    lseek(mailbox->index_fd, mailbox->index_size, SEEK_SET);
> +    
> +    last_offset = mailbox->start_offset + mailbox->exists * mailbox->record_size;
> +    lseek(mailbox->index_fd, last_offset, SEEK_SET);
>      if (retry_write(mailbox->index_fd, index_chunk,
> -                    upload_list->count * INDEX_RECORD_SIZE) < 0)
> +                    upload_list->count * mailbox->record_size) < 0)
>          goto fail;
>  
>  


-- 
Kenneth Murchison
Systems Programmer
Project Cyrus Developer/Maintainer
Carnegie Mellon University


More information about the Cyrus-devel mailing list