Wiki, information on database back ends.

Bron Gondwana brong at fastmail.fm
Tue Sep 14 18:46:06 EDT 2010


On Tue, Sep 14, 2010 at 02:24:17PM -0500, Patrick Goetz wrote:
> On 09/14/2010 12:26 PM, Dave McMurtrie wrote:
> >
> >The Cyrus wiki's content has been mostly moved to
> >http://www.cyrusimap.org/ except for what I considered to be useless or
> >outdated content.
> >
> 
> Hmmm, the take away message is the wiki is rather light on useful,
> timely content.  <:)
> 
> There's been some discussion on the Debian cyrus list about how to
> automate upgrades from cyrus 2.n.k to cyrus 2.m.j.  Jeroen van
> Meeuwen (on both lists) suggested that the cyrus RPM package
> features a utility called cyrus-imapd.cvt_cyrusdb_all which might be
> useful for this.

Yeah - we were talking about this the other day on instant messaging,
Jeroen and I!  I've written something a bit nicer.  Basically, I
ripped out the "guts" of cvt_cyrusdb and stuck it in lib/cyrusdb.c.
Then I wrote a "detect" function that checks the magic at the start
of the file and figures out if it is berkeley, berkeley-hash or skiplist.
Then for each file it checks if the type matches the configuration
value, and converts if it doesn't.

This runs as part of ctl_cyrusdb -r at startup.
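
To make the detect-and-convert idea concrete, here is a rough sketch in C.
It is not the code from lib/cyrusdb.c: the helper names are hypothetical,
and the magic values (the skiplist header string, and the Berkeley DB
btree/hash magic numbers at byte offset 12) are assumptions that should be
checked against your own source tree.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed magic values; verify them against your own source tree. */
#define SKIPLIST_MAGIC   "\241\002\213\015skiplist file\0\0\0"
#define BDB_BTREE_MAGIC  0x00053162u
#define BDB_HASH_MAGIC   0x00061561u

/* Hypothetical helper: guess the backend from the first 16 bytes of a
 * file.  Returns NULL when the format cannot be identified (for example
 * flat files, or a file that is too short). */
static const char *detect_backend(const unsigned char *buf, size_t len)
{
    uint32_t bdb_magic;

    if (len < 16)
        return NULL;

    /* Comparing the first 16 bytes of the header is enough here. */
    if (memcmp(buf, SKIPLIST_MAGIC, 16) == 0)
        return "skiplist";

    /* Berkeley DB keeps its magic number at byte offset 12, in the byte
     * order of the machine that created the file. */
    memcpy(&bdb_magic, buf + 12, sizeof(bdb_magic));
    if (bdb_magic == BDB_BTREE_MAGIC)
        return "berkeley";
    if (bdb_magic == BDB_HASH_MAGIC)
        return "berkeley-hash";

    return NULL;
}

/* Convert only when a format was detected and it differs from the
 * configured one. */
static int needs_conversion(const char *detected, const char *configured)
{
    return detected != NULL && strcmp(detected, configured) != 0;
}
```

A caller would read the first 16 bytes of each database file, pass them to
detect_backend(), and convert the file only if needs_conversion() says the
detected type differs from the configured one.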
 
> I've been looking at this script, and it mostly appears to be using
> cvt_cyrusdb to convert particular db files to Cyrus skiplists and
> then back again to the original db backend format.  I can't follow
> the script completely as it seems to rely on DB configuration
> details found in the imapd.conf file I don't have in my Debian
> 2.1.16 imap server, and it's also not clear how the script is run.
> 
> This raises a number of questions, though:
> 
> 1.
> Cyrus skiplists?  I thought all the DB files were in Berkeley DB
> format.  I tried to find some documentation on skiplists, but only
> found an old message to the developer list from Bron Gondwana
> discussing skiplist bugs
> (http://markmail.org/message/zbaq765brbg2acfj).

Yes, Cyrus Skiplists.  It's a DB format written entirely inside
Cyrus.  They're quite stable now.  The only real downside is that
the lock is global per database - they don't have any concept of
row locking, so concurrency can suffer.  This usually isn't a
big problem.  At FastMail we've had ALL our databases in skiplist
for a couple of years now.
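
As an illustration of what "the lock is global per database" means, here
is a minimal sketch of a whole-file lock using POSIX fcntl().  It is not
the locking code Cyrus uses; lock_whole_db() is a hypothetical helper that
only shows the idea that one lock covers every record in the file.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: take or release one lock covering the whole file.
 * type is F_RDLCK, F_WRLCK or F_UNLCK.  Blocks until the lock is granted.
 * Returns 0 on success, -1 on error (see errno). */
static int lock_whole_db(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;   /* 0 means "to end of file": all records at once */

    return fcntl(fd, F_SETLKW, &fl);
}
```

With a lock like this, any writer holding F_WRLCK makes every other
process wait, whichever record it wants, which is where the concurrency
cost comes from.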

> On the other hand, this guy talks about converting all Berkeley DB
> files to skiplists because of perceived libdb bugs:
> http://www.mail-archive.com/info-cyrus@lists.andrew.cmu.edu/msg31953.html

I'm currently trying to find someone (either inside Opera or elsewhere)
to help me debug Cyrus' use of BDB and see if we can do it better.
I suspect the BDB problems are more with how we're using it than
with BDB itself.

> Skiplists: what are they, when and why use them?  Either I'm a bad
> googler or documentation seems to be lacking.

lib/cyrusdb_skiplist.c - knock yourself out :)

They're very good for sequential reads - "foreach" and friends.  It's
a very lightweight format, which provides pretty good locality of
data - so it's fairly cache friendly.
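
For anyone who hasn't met the data structure, here is a small in-memory
skip list sketch in C.  It is not the on-disk format in
lib/cyrusdb_skiplist.c, and every name in it is made up for illustration.
It only shows why sequential reads are cheap: every record is linked at
level 0 in key order, so a "foreach" is a plain linked-list walk.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_LEVEL 8

struct node {
    const char *key;
    const char *val;
    struct node *forward[MAX_LEVEL];  /* forward[0] links every record */
};

struct skiplist {
    struct node head;                 /* sentinel; carries no record */
};

/* Each extra level is kept with probability 1/2. */
static int random_level(void)
{
    int level = 1;
    while (level < MAX_LEVEL && (rand() & 1))
        level++;
    return level;
}

/* Insert a record, or replace the value if the key already exists.
 * Returns 0 on success, -1 if memory allocation fails. */
static int sl_store(struct skiplist *sl, const char *key, const char *val)
{
    struct node *update[MAX_LEVEL];
    struct node *cur = &sl->head;
    struct node *n;
    int i, level;

    /* Walk down from the sparsest level, remembering the last node
     * before the key at each level. */
    for (i = MAX_LEVEL - 1; i >= 0; i--) {
        while (cur->forward[i] && strcmp(cur->forward[i]->key, key) < 0)
            cur = cur->forward[i];
        update[i] = cur;
    }

    n = cur->forward[0];
    if (n && strcmp(n->key, key) == 0) {
        n->val = val;
        return 0;
    }

    n = calloc(1, sizeof(*n));
    if (!n)
        return -1;
    n->key = key;
    n->val = val;
    level = random_level();
    for (i = 0; i < level; i++) {
        n->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = n;
    }
    return 0;
}

/* "foreach": visit every record in key order via level 0 only. */
static void sl_foreach(const struct skiplist *sl,
                       void (*cb)(const char *key, const char *val, void *rock),
                       void *rock)
{
    const struct node *n;
    for (n = sl->head.forward[0]; n; n = n->forward[0])
        cb(n->key, n->val, rock);
}

/* Example callback: count the records visited. */
static void count_cb(const char *key, const char *val, void *rock)
{
    (void)key;
    (void)val;
    (*(int *)rock)++;
}
```

The struct skiplist must be zeroed before first use (memset is enough).
Keys and values are borrowed pointers here, not copies, to keep the sketch
short.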

> 2.
> The Redhat cvt_cyrusdb_all script seems to assume a specific set of
> database files.  Is the set of cyrus imap DB files fixed, and if so
> what are they?  Is there any documentation on what each database
> file contains? This would be very useful to people trying to convert
> older cyrus IMAP installations to new ones.

Pretty much, yes.  There are a handful of files - plus the per user
seen, sub and quota files.  Seen are skiplist and sub is flat file.
Quota is its own special format.  Here's the listing of the main
databases:

dblist[] = {
    { FNAME_MBOXLIST,       &config_mboxlist_db,    1 },
    { FNAME_QUOTADB,        &config_quota_db,       1 },
    { FNAME_ANNOTATIONS,    &config_annotation_db,  1 },
    { FNAME_DELIVERDB,      &config_duplicate_db,   0 },
    { FNAME_TLSSESSIONS,    &config_tlscache_db,    0 },
    { FNAME_PTSDB,          &config_ptscache_db,    0 },
    { FNAME_STATUSCACHEDB,  &config_statuscache_db, 0 },
    { NULL,                 NULL,                   0 }
};

The only three you really need to care about are mboxlist,
quota and annotations - and of those, quota probably doesn't
exist if you've got "quotalegacy".  By legacy I mean we use
it - because there's less lock contention and it's more reliable.

Anyway.  Discard the ones with '0' in the archive value,
because they're just caches and the format has probably
changed anyway - but upgrade your mboxlist and annotation
files.
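
A rough sketch of that rule in C, for anyone scripting an upgrade.  The
struct and function names are hypothetical, and the file names are my
assumption of what the FNAME_* macros expand to - check them against your
own tree.  The last field of each dblist entry is the archive value.

```c
#include <stddef.h>

struct db_entry {
    const char *fname;  /* file name under the config directory (assumed) */
    int archive;        /* 1 = real data, 0 = disposable cache */
};

/* Assumed stand-ins for the FNAME_* macros in the listing above. */
static const struct db_entry dbs[] = {
    { "/mailboxes.db",         1 },
    { "/quotas.db",            1 },
    { "/annotations.db",       1 },
    { "/deliver.db",           0 },
    { "/tls_sessions.db",      0 },
    { "/ptclient/ptscache.db", 0 },
    { "/statuscache.db",       0 },
    { NULL,                    0 }
};

/* Collect the names of the databases worth converting (archive == 1).
 * Stores at most max names in keep[] and returns how many were stored.
 * Entries with archive == 0 are skipped, since they are only caches and
 * can simply be deleted. */
static size_t dbs_to_upgrade(const char **keep, size_t max)
{
    size_t n = 0;
    const struct db_entry *e;

    for (e = dbs; e->fname != NULL; e++) {
        if (e->archive && n < max)
            keep[n++] = e->fname;
    }
    return n;
}
```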

Skiplist hasn't changed format in approximately forever.
I have considered upgrading it (mainly to add some more
internal integrity checks), but the benefits haven't
outweighed the costs yet.  I did write a skiplist-2 file
format at one point and start playing with it, but that
was years ago.

> 3.
> The discussion of DB backends leads one to wonder if this means
> Berkeley DB or skiplists, or if other backends are used, too?  Is
> there any documentation on this?

There's flat - and Ken added some SQL support (sqlite,
mysql and postgresql) a little while back, though I
haven't tested it yet.

No, there's not much documentation.  I'm working on
fixing that too.  I wrote up an outline of what I want
to document on the old wiki - not sure if it's been
ported across, but I have a copy in my email as well.
I'll paste it below.

Bron.

====================================================

Here's an overview of what needs to be documented.



---++ On Disk Format

   * mailbox
      * cyrus.header
      * cyrus.index
      * cyrus.cache
      * cyrus.squat (stub for now)
   * message files (rfc822)
   * file naming
      * dir hashing algorithms
      * config variables (including partitions)
      * domain split
   * db subformats
      * quota
      * seen
      * sub
      * mboxlist
      * deliver
      * annotations
      * statuscache
   * sieve
   * sync log files
   * proc files
   * "special" - shutdown, etc.
   * db formats: skiplist, flat, berkeley, quotalegacy

---++ Locking

   * name locks
   * cyrus.index locks
   * deadlock prevention

---++ Index API

   * how it works
   * how the "client view" is kept in sync

---++ Replication

   * wire format (dlist)
   * full protocol overview
   * locking considerations
   * sync_crc - calculation and purpose
   * split brain recovery

---++ Reconstruct

   * how it works now
   * flags and purpose (also, man page)

---++ mbdump

   * still needs to be rewritten to use dlist!
   * incremental dumps

---++ Internal APIs

   * seqset_
   * buf_
   * charset_
   * prot_

There's lots of stuff that needs to be either documented
or updated to make Cyrus development viable for people
who aren't Bron right now.  Lots has changed!


