What's happening in future branch - March 25

Bron Gondwana brong at fastmail.fm
Thu Mar 25 07:28:53 EDT 2010


Gosh, it's getting awfully close to my self-imposed deadline of April,
isn't it?  Though I think I'll be pulling an Ubuntu and saying "I
meant the end of April, honest".  Sysadmin stuff got in the way for
rather longer than I intended.

Due to making changes all over the place, I currently have a giant change
entry on the future branch with all the changes in it.  I'll try to factor
them out into individual line items later, to keep it reviewable.  I'm not
going to try to make every intermediate stage work - or even compile -
because it's too much makework for no benefit.

That said, here's where things are! (long, but hopefully interesting to
at least some other people... if nothing else a handy memory jogger for
me if I get side-tracked at all)

Rewritten (greatly simplified) API for getting file names:
==========================================================

char *fname = mailbox_meta_fname(&mailbox, META_INDEX);
char *newfname = mailbox_meta_newfname(&mailbox, META_INDEX);

(uses a separate static buffer so you can just do this and then
your rename logic!)

char *fname = mailbox_message_fname(&mailbox, uid);

So that simplifies HEAPS of code and hides all the "is it meta"
behind one interface.
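
To make the "separate static buffer" point concrete, here is a minimal sketch of how such an interface could look.  Everything except the three function names and META_INDEX is an assumption for illustration: the cut-down struct mailbox, the META_HEADER and META_CACHE values, the buffer sizes and the path layout.  It also ignores the separate metadata partition that the real code hides behind this interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down types: the real struct mailbox holds far more. */
enum meta_file { META_HEADER, META_INDEX, META_CACHE };

struct mailbox {
    const char *path;   /* assumed: directory holding the mailbox files */
};

static const char *meta_basename(enum meta_file which)
{
    switch (which) {
    case META_HEADER: return "cyrus.header";
    case META_INDEX:  return "cyrus.index";
    case META_CACHE:  return "cyrus.cache";
    }
    return NULL;
}

/* Returns a pointer to a static buffer, overwritten by the next call. */
char *mailbox_meta_fname(struct mailbox *mailbox, enum meta_file which)
{
    static char buf[1024];
    const char *base = meta_basename(which);
    assert(base != NULL);
    snprintf(buf, sizeof(buf), "%s/%s", mailbox->path, base);
    return buf;
}

/* Uses its OWN static buffer, so an earlier mailbox_meta_fname()
 * result is still valid when this returns. */
char *mailbox_meta_newfname(struct mailbox *mailbox, enum meta_file which)
{
    static char buf[1024];
    const char *base = meta_basename(which);
    assert(base != NULL);
    snprintf(buf, sizeof(buf), "%s/%s.NEW", mailbox->path, base);
    return buf;
}

/* Message files are assumed to be named "<uid>." inside the mailbox. */
char *mailbox_message_fname(struct mailbox *mailbox, unsigned long uid)
{
    static char buf[1024];
    snprintf(buf, sizeof(buf), "%s/%lu.", mailbox->path, uid);
    return buf;
}
```

With that shape, the rewrite-and-replace pattern is just: write to mailbox_meta_newfname(), then rename(newfname, fname), with both pointers valid at the same time.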

Mboxlist functions return a struct
==================================

struct mboxlist_entry mbentry;
r = mboxlist_lookup(name, &mbentry, &tid);

The fields are then just mbentry.mbtype, mbentry.partition and mbentry.acl.

This simplifies parameter passing to functions considerably.
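
A rough sketch of the idea, for anyone who wants to see the shape of it.  The three field names and the call signature come from the example above; the field types, the toy in-memory table, the made-up entries and the "non-zero means not found" convention are all assumptions, not the real mboxlist code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct txn;   /* opaque transaction handle; unused in this sketch */

/* Field names from the post; the types are assumed. */
struct mboxlist_entry {
    int mbtype;
    const char *partition;
    const char *acl;
};

/* Toy stand-in for the mailboxes database, with made-up entries. */
static const struct {
    const char *name;
    struct mboxlist_entry entry;
} toy_table[] = {
    { "user.alice",       { 0, "default", "alice all" } },
    { "user.alice.Trash", { 0, "default", "alice all" } },
};

/* Fills *mbentry and returns 0, or returns non-zero if not found. */
int mboxlist_lookup(const char *name, struct mboxlist_entry *mbentry,
                    struct txn **tid)
{
    size_t i;
    (void)tid;
    assert(name != NULL && mbentry != NULL);
    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
        if (strcmp(toy_table[i].name, name) == 0) {
            *mbentry = toy_table[i].entry;   /* one struct copy, no out-params */
            return 0;
        }
    }
    return -1;
}
```

The win is that a caller needing only the partition no longer has to pass a handful of NULL out-parameters, and adding a field later does not change every call site.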

Locking
=======

Here's the big one!  I'll write a separate section on this in more
detail, but here are the basic contracts:

New file: cyrus.lock - empty file.

exclusive lock on cyrus.lock:
  * free rein with all files.  Delete what you like, rewrite
    what you like
  * no other locks required

shared lock on cyrus.lock: 
  * MUST NOT replace cyrus.index or cyrus.cache files (change inode)
  * MUST NOT change any files at all without cyrus.index lock
  * MAY rewrite cyrus.squat.

exclusive lock on cyrus.index:
  * MUST have a shared lock on cyrus.lock
  * MAY rewrite flags/modseq on any index record
  * MAY append new index records
  * MAY append to cyrus.cache
  * MUST NOT rewrite older parts of cyrus.cache
  * MAY rewrite cyrus.header (write .NEW, rename)
  * MAY rewrite user.seen file

shared lock on cyrus.index:
  * optional - gives consistent reads.  Otherwise modseqs higher than
    last read header may be found, seen may be out of sync.  I have a
    simple concurrency test case that can see unintended data by being
    clever.
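
The contracts above can be sketched with flock(2) roughly as follows.  This is an illustration only: the struct, the locks_* function names and the use of non-blocking flock are assumptions (the branch may well use fcntl locks and blocking waits), and the optional shared lock on cyrus.index is left out to keep it short.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical holder for the two lock file descriptors. */
struct mailbox_locks {
    int lock_fd;        /* cyrus.lock: shared or exclusive */
    int index_fd;       /* cyrus.index: locked only while writing */
    int exclusive;      /* non-zero if cyrus.lock is held exclusively */
    int index_locked;   /* non-zero while cyrus.index is locked */
};

static int open_in(const char *dir, const char *name)
{
    char path[1024];
    snprintf(path, sizeof(path), "%s/%s", dir, name);
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Take cyrus.lock shared or exclusive.  Non-blocking so that conflicts
 * are visible in a demo; a real server would normally wait instead. */
int locks_open(struct mailbox_locks *l, const char *dir, int exclusive)
{
    l->exclusive = exclusive;
    l->index_locked = 0;
    l->lock_fd = open_in(dir, "cyrus.lock");
    if (l->lock_fd < 0) return -1;
    l->index_fd = open_in(dir, "cyrus.index");
    if (l->index_fd < 0) { close(l->lock_fd); return -1; }
    if (flock(l->lock_fd, (exclusive ? LOCK_EX : LOCK_SH) | LOCK_NB) != 0) {
        close(l->index_fd);
        close(l->lock_fd);
        return -1;
    }
    return 0;
}

/* Exclusive lock on cyrus.index; requires cyrus.lock to be held already. */
int locks_lock_index(struct mailbox_locks *l)
{
    if (flock(l->index_fd, LOCK_EX | LOCK_NB) != 0) return -1;
    l->index_locked = 1;
    return 0;
}

void locks_unlock_index(struct mailbox_locks *l)
{
    flock(l->index_fd, LOCK_UN);
    l->index_locked = 0;
}

/* The contract: writes need either the exclusive cyrus.lock,
 * or a shared cyrus.lock plus the exclusive cyrus.index lock. */
int locks_may_write(const struct mailbox_locks *l)
{
    return l->exclusive || l->index_locked;
}

/* Closing the descriptors drops whatever locks they held. */
void locks_close(struct mailbox_locks *l)
{
    close(l->index_fd);
    close(l->lock_fd);
}
```

Note that flock locks belong to the open file description, so two separate open() calls conflict with each other even inside one process, which makes the rules easy to exercise in a single test program.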

All this infrastructure is in place in the future branch now, though
untested so far.  APIs are:

mailbox_open_shared(name, auth_state, &mailbox);
mailbox_open_exclusive(name, auth_state, &mailbox);

mailbox_lock_index(&mailbox);
mailbox_unlock_index(&mailbox);

mailbox_close(&mailbox);

CRCs and automatic bookkeeping
==============================

(remember the discussion on recno replacing msgno because expunged records
 remain in cyrus.index rather than cyrus.expunge, so need to be skipped?
 It's necessary for the above, because cyr_expire will get an exclusive lock
 on the mailbox for the duration of its index rewrite!)

Move "deleted / flagged / answered" meta count management into mailbox.c:

mailbox_rewrite_index_record(&mailbox, recno, &record);
 * reads the OLD record again, updates counts, updates CRC32 values including
   XOR_CRC which is the XOR of all record CRCs.
 * updates "exists" if it's an expunge.
 * updates modseq value.  Calls:

mailbox_index_dirty(&mailbox)
 * anything doing a write to ANYTHING does this.  It sets a flag,
   increments the highestmodseq and grabs a timestamp.  All changes
   use these values.

mailbox_header_write(&mailbox);
 * rewrites the header file.  Also updates the mailbox->header_file_crc
   value which will be saved when we call:

mailbox_write_index_header(&mailbox);
 * MUST be called after making any change to ensure all checksums and
   XOR_CRC are correctly stored - along with the bumped highestmodseq
   and correct record counts.

Oh, we missed:

mailbox_append_index_records(&mailbox, &records, num);

Stores "num" records to the index file.  It's a pointer to an array of
"struct index_record" structs.

I'm still reworking everything to use these rather than doing the
bookkeeping by hand!
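
To show why centralising this is attractive, here is a small in-memory sketch of the bookkeeping.  The function names follow the ones above, but the struct layouts, flag bit values, fixed-size record array, signatures, the choice of fields fed into the CRC and the "bump highestmodseq once per batch" rule are all assumptions for illustration.  It also assumes that every record, expunged or not, contributes to XOR_CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical flag bits and cut-down structs; not the real layout. */
#define FLAG_ANSWERED (1u << 0)
#define FLAG_FLAGGED  (1u << 1)
#define FLAG_DELETED  (1u << 2)
#define FLAG_EXPUNGED (1u << 31)
#define MAX_RECORDS 16

struct index_record { uint32_t uid; uint32_t system_flags; uint64_t modseq; };

struct mailbox {
    struct index_record records[MAX_RECORDS];
    uint32_t num_records;
    long exists, answered, flagged, deleted;
    uint64_t highestmodseq;
    time_t last_updated;
    uint32_t xor_crc;
    int dirty;
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320), chainable. */
static uint32_t crc32_bytes(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;
    int bit;
    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return ~crc;
}

/* CRC the fields one by one so struct padding never enters the sum. */
static uint32_t record_crc(const struct index_record *r)
{
    uint32_t crc = crc32_bytes(0, &r->uid, sizeof(r->uid));
    crc = crc32_bytes(crc, &r->system_flags, sizeof(r->system_flags));
    return crc32_bytes(crc, &r->modseq, sizeof(r->modseq));
}

/* Add (sign +1) or remove (sign -1) one record from XOR_CRC and counts. */
static void account(struct mailbox *mb, const struct index_record *r, int sign)
{
    mb->xor_crc ^= record_crc(r);                 /* XOR is its own inverse */
    if (r->system_flags & FLAG_EXPUNGED) return;  /* not part of "exists" */
    mb->exists += sign;
    if (r->system_flags & FLAG_ANSWERED) mb->answered += sign;
    if (r->system_flags & FLAG_FLAGGED)  mb->flagged  += sign;
    if (r->system_flags & FLAG_DELETED)  mb->deleted  += sign;
}

/* Assumed: the first change in a batch bumps highestmodseq exactly once. */
void mailbox_index_dirty(struct mailbox *mb)
{
    if (mb->dirty) return;
    mb->dirty = 1;
    mb->highestmodseq++;
    mb->last_updated = time(NULL);
}

int mailbox_rewrite_index_record(struct mailbox *mb, uint32_t recno,
                                 struct index_record *record)
{
    struct index_record *old;
    if (recno == 0 || recno > mb->num_records) return -1;
    old = &mb->records[recno - 1];    /* "reads the OLD record again" */
    mailbox_index_dirty(mb);
    account(mb, old, -1);             /* take the old state out first */
    record->modseq = mb->highestmodseq;
    account(mb, record, +1);
    *old = *record;
    return 0;
}

int mailbox_append_index_records(struct mailbox *mb,
                                 const struct index_record *records,
                                 uint32_t num)
{
    uint32_t i;
    if (num > MAX_RECORDS - mb->num_records) return -1;
    mailbox_index_dirty(mb);
    for (i = 0; i < num; i++) {
        struct index_record *slot = &mb->records[mb->num_records++];
        *slot = records[i];
        slot->modseq = mb->highestmodseq;
        account(mb, slot, +1);
    }
    return 0;
}

/* Stand-in for writing the header: here it only ends the dirty batch. */
void mailbox_write_index_header(struct mailbox *mb) { mb->dirty = 0; }

/* Slow cross-check: recompute XOR_CRC from every stored record. */
uint32_t recompute_xor_crc(const struct mailbox *mb)
{
    uint32_t i, x = 0;
    for (i = 0; i < mb->num_records; i++) x ^= record_crc(&mb->records[i]);
    return x;
}
```

Because XOR is its own inverse, rewriting one record costs two CRC computations rather than a rescan of the whole index, and callers never touch the counters directly.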

index.c - use a mailbox object
==============================

All the globals go away!  There's just a global mailbox struct, which
will have a lock_fd in shared mode, meaning that all the offsets into
the cyrus.index and cyrus.cache mmaps are ALWAYS VALID.

Along with compulsory CONDSTORE - vast swathes of code complexity are
being removed :)  You don't need to track times when you have modseq
to determine if flags should be sent again.  You don't have to keep
statting files, just re-read the header and then scan for updates.

It WILL be necessary to keep a global array mapping from msgno to recno
so non-UID commands can be efficient.  This can be rewritten whenever
highestmodseq is detected to have changed :)
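
A minimal sketch of such a msgno to recno map, assuming expunged records are marked with a FLAG_EXPUNGED bit in system_flags and simply skipped.  The struct, the function names and the flag's bit value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FLAG_EXPUNGED (1u << 31)   /* hypothetical bit value */

struct index_record { uint32_t uid; uint32_t system_flags; };

struct msgno_map {
    uint32_t *recno;            /* recno[msgno - 1]; both are 1-based */
    uint32_t exists;            /* number of non-expunged messages */
    uint64_t built_at_modseq;   /* highestmodseq when the map was built */
};

/* Rebuild the map by walking every record and skipping expunged ones.
 * The map must start out zero-initialised. */
int msgno_map_rebuild(struct msgno_map *map,
                      const struct index_record *records,
                      uint32_t num_records, uint64_t highestmodseq)
{
    uint32_t recno, exists = 0;
    uint32_t *fresh = malloc((num_records ? num_records : 1) * sizeof(*fresh));
    if (!fresh) return -1;
    for (recno = 1; recno <= num_records; recno++) {
        if (records[recno - 1].system_flags & FLAG_EXPUNGED) continue;
        fresh[exists++] = recno;
    }
    free(map->recno);
    map->recno = fresh;
    map->exists = exists;
    map->built_at_modseq = highestmodseq;
    return 0;
}

/* The map is stale as soon as highestmodseq has moved on. */
int msgno_map_is_stale(const struct msgno_map *map, uint64_t highestmodseq)
{
    return map->recno == NULL || map->built_at_modseq != highestmodseq;
}

/* Returns the recno for a msgno, or 0 if the msgno is out of range. */
uint32_t msgno_to_recno(const struct msgno_map *map, uint32_t msgno)
{
    if (msgno == 0 || msgno > map->exists) return 0;
    return map->recno[msgno - 1];
}
```

Rebuilding on every highestmodseq change is the simple option; it also rebuilds for plain flag changes that do not move any msgno, which a smarter version could avoid.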

I'm just starting on this - hence there are lots of global references to
remove.

Also - no more UID(msgno) and CACHE_OFFSET(msgno) and all that jazz.  We
read a record with mailbox_read_index_record() and then use it.  In native
form, in an easily accessible struct.  Joy.  No more 
htonl((bit32 *)base_offset+SOME_CONSTANT) nonsense everywhere!  Contain it,
wrap it, ensure accesses to it go via consistency checking, checksum
updating and header counter maintaining paths!
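
For illustration, this is the kind of wrapper meant here: one function that turns the network-byte-order bytes in the mmap into a native struct, and one that goes the other way.  The record layout, the offsets, the field list and the function names are invented for the sketch and do not match the real cyrus.index format.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk layout: three big-endian 32-bit fields. */
#define OFFSET_UID          0
#define OFFSET_SYSTEM_FLAGS 4
#define OFFSET_CACHE_OFFSET 8
#define RECORD_SIZE         12

struct index_record {
    uint32_t uid;
    uint32_t system_flags;
    uint32_t cache_offset;
};

/* memcpy avoids unaligned access through a cast pointer. */
static uint32_t get_be32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return ntohl(v);
}

static void put_be32(unsigned char *p, uint32_t v)
{
    v = htonl(v);
    memcpy(p, &v, sizeof(v));
}

/* Decode record number recno (1-based) from the mapped bytes into
 * native form.  Returns -1 if the record lies outside the buffer. */
int parse_index_record(const unsigned char *base, size_t len,
                       uint32_t recno, struct index_record *record)
{
    const unsigned char *p;
    if (recno == 0 || (size_t)recno * RECORD_SIZE > len) return -1;
    p = base + (size_t)(recno - 1) * RECORD_SIZE;
    record->uid          = get_be32(p + OFFSET_UID);
    record->system_flags = get_be32(p + OFFSET_SYSTEM_FLAGS);
    record->cache_offset = get_be32(p + OFFSET_CACHE_OFFSET);
    return 0;
}

/* The inverse: encode a native record back into the buffer. */
int store_index_record(unsigned char *base, size_t len,
                       uint32_t recno, const struct index_record *record)
{
    unsigned char *p;
    if (recno == 0 || (size_t)recno * RECORD_SIZE > len) return -1;
    p = base + (size_t)(recno - 1) * RECORD_SIZE;
    put_be32(p + OFFSET_UID, record->uid);
    put_be32(p + OFFSET_SYSTEM_FLAGS, record->system_flags);
    put_be32(p + OFFSET_CACHE_OFFSET, record->cache_offset);
    return 0;
}
```

Once every access goes through a pair like this, the bounds check, the byte swapping and any checksum update live in exactly one place.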


So - that's the status of the future branch!  It also has most of the
interesting patches from the fastmail branch cherry-picked forwards to 
sit on top of CVS trunk.  

Feel free to poke around, make suggestions, send patches, etc.  Some of
the less popular (i.e. I don't use them) utilities could probably do with
some TLC or maybe merging into uber-tools if they don't do much.  While I
haven't drifted too far incompatible just yet, it's getting that way!

Speaking of which: cyrus.expunge.  It's pretty much all stripped now.  I'll
be writing something simple in to mailbox_upgrade_index which detects its
presence and does a sorted merge of the two files into the new cyrus.index,
complete with system_flags & FLAG_EXPUNGED.  There's also FLAG_UNLINKED
which will allow us to have immediate expunge of the actual on-disk-file
(with the associated inability to unexpunge of course!) if sites want that.
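
The sorted merge could look something like this sketch, which is not the actual upgrade code.  It assumes both files are already sorted by UID, that a UID appears in only one of them, and that records are reduced to two fields; the function name and the FLAG_EXPUNGED bit value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_EXPUNGED (1u << 31)   /* hypothetical bit value */

struct index_record { uint32_t uid; uint32_t system_flags; };

/* Merge two UID-sorted arrays into out (which must have room for
 * n_index + n_expunge records).  Everything that came from the old
 * cyrus.expunge is tagged with FLAG_EXPUNGED.  Returns the number of
 * records written. */
size_t merge_expunge(const struct index_record *index, size_t n_index,
                     const struct index_record *expunge, size_t n_expunge,
                     struct index_record *out)
{
    size_t i = 0, e = 0, n = 0;
    while (i < n_index || e < n_expunge) {
        /* Take from expunge when index is exhausted, or when the next
         * expunged UID sorts before the next live UID. */
        int take_expunge = (i == n_index) ||
            (e < n_expunge && expunge[e].uid < index[i].uid);
        if (take_expunge) {
            out[n] = expunge[e++];
            out[n].system_flags |= FLAG_EXPUNGED;
        } else {
            out[n] = index[i++];
        }
        n++;
    }
    return n;
}
```

The result keeps UID order, so the new cyrus.index can still be searched by UID, with the expunged records skipped by flag rather than living in a second file.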


Thanks for reading!

Bron.

