*IMPORTANT* - bugfix sync_append_commit index breakage

Ken Murchison murch at andrew.cmu.edu
Fri Aug 31 11:16:53 EDT 2007


I think your patch makes sense, but I'm not sure when "The cyrus index 
file format has this clever "ignore" junk at the end until the exists 
count changes trick" means.  I know we leave junk in the cache file as a 
result of delayed expunge which gets cleaned up later, but I'm pretty 
sure the the index file is always tightly packed.  The is extra space in 
the index header, but there shouldn't be any between the index records.


Bron Gondwana wrote:
> We've had issues with broken indexes on the old master after a failover,
> and puzzled for ages over the cause.  This is the reason behind the
> SIGQUIT patch which would mitigate this by reducing the "unfinished
> business" on a standard shutdown.
> 
> But still, the symptoms were weird.
> 
> I'm pretty sure I've figured out the cause, and the attached patch
> (compile tested only) fixes it...  unfortunately it's a race condition
> so it's hard to test.
> 
> Ken - it would be great if you can check this today and push it to CVS
> if you agree with me.  It's a nasty one - especially since the symptoms
> are offset index records right until something that's not a sync_server
> append comes along and writes to the correct place in the index file
> again.
> 
> On a more structural note - it would be great to re-factor the lower
> level APIs (mailbox.c, mboxlist.c, etc) to provide everything that
> sync_server needs manipulate their data structures and remove all the
> low level fiddling copy-and-pasted stuff from the sync layer.  This
> comes back a bit to David's comments today about the difficulty of
> having patches flying around - I suspect he copied and pasted so he
> wasn't throwing tentacles deep through the main codebase - but the
> end result is more brittle software and poor abstraction.  We need
> to avoid doing that if there's a bunch of us all hacking in our own
> little patch space!
> 
> Bron.
> 
> 
> ------------------------------------------------------------------------
> 
> Bugfix sync index append
> 
> After a failover, we sometimes get index files
> with duplicate records in them and other nastiness.
> 
> The cyrus index file format has this clever "ignore"
> junk at the end until the exists count changes trick.
> 
> sync_commit.c contains a copy and pasted "index append" 
> logic, but it seeks to the end of the file rather than
> start_offset + exists * record_size.  This is bogus and
> means that at some random future stage a real record will
> be overwritten and two of the old bogus record will still
> exist.  Crap.
> 
> This patch fixes that, obviously.
> Index: cyrus-imapd-2.3.9/imap/sync_commit.c
> ===================================================================
> --- cyrus-imapd-2.3.9.orig/imap/sync_commit.c	2007-08-31 10:13:28.000000000 -0400
> +++ cyrus-imapd-2.3.9/imap/sync_commit.c	2007-08-31 10:26:03.000000000 -0400
> @@ -481,6 +481,7 @@
>      unsigned long newflagged;
>      char  target[MAX_MAILBOX_PATH];
>      int   n, r = 0;
> +    long   last_offset;
>      struct txn *tid = NULL;
>      modseq_t highestmodseq = 0;
>  
> @@ -577,9 +578,11 @@
>      if (retry_writev(mailbox->cache_fd, cache_iovec, upload_list->count) < 0)
>          goto fail;
>  
> -    lseek(mailbox->index_fd, mailbox->index_size, SEEK_SET);
> +    
> +    last_offset = mailbox->start_offset + mailbox->exists * mailbox->record_size;
> +    lseek(mailbox->index_fd, last_offset, SEEK_SET);
>      if (retry_write(mailbox->index_fd, index_chunk,
> -                    upload_list->count * INDEX_RECORD_SIZE) < 0)
> +                    upload_list->count * mailbox->record_size) < 0)
>          goto fail;
>  
>  


-- 
Kenneth Murchison
Systems Programmer
Project Cyrus Developer/Maintainer
Carnegie Mellon University


More information about the Cyrus-devel mailing list