choosing a file system

Bron Gondwana brong at fastmail.fm
Sat Jan 3 05:45:47 EST 2009


On Sat, Jan 03, 2009 at 11:46:41AM +0530, ram wrote:
> Running multiple cyrus instances with different dbs ? How do we do that.
> I have seen the ultimate io-contention point is the mailboxes.db file.
> And that has to be single. 

Yeah, mailboxes.db access kinda sucks like that.  If you're making any
changes then it locks the entire DB with a single writelock.

I did consider fine-grained mailboxes.db locking at one point.  It's
definitely doable with fcntl locking, which is what Cyrus is using on
our machines.  It would require some small format changes to skiplist
though.  Somewhere in a checkout I have cyrusdb_skiplist2.c which
contains a bunch of checksumming code and the start of the new format.
I got sidetracked and never finished it though.
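
To illustrate what fine-grained fcntl locking means in general, here is a minimal Python sketch that locks only the byte range being rewritten instead of the whole file. It is not Cyrus code and not the skiplist format: the helper names `lock_range` and `update_record` are hypothetical, and a real skiplist would still need the format changes mentioned above before range locks could be used safely.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def lock_range(fd, start, length, exclusive=True):
    """Hold an fcntl byte-range lock on [start, start + length) of fd."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    # lockf(fd, cmd, len, start, whence) blocks until the range is free.
    fcntl.lockf(fd, mode, length, start, os.SEEK_SET)
    try:
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)


def update_record(path, offset, payload):
    """Overwrite len(payload) bytes at offset, locking only that range."""
    fd = os.open(path, os.O_RDWR)
    try:
        with lock_range(fd, offset, len(payload)):
            os.pwrite(fd, payload, offset)
    finally:
        os.close(fd)
```

Two writers touching different ranges of the same file would not block each other here, whereas a single whole-file writelock serialises them.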

All our cyrus instances are installed on completely different drives.
Entirely self-contained on those external units so we can plug them
into a new machine and go.  The init scripts are in /etc/init.d/, but
they are generated from templates which pull their configuration from
a central file.  We can create a new pair of cyrus instances by adding
a single line that looks like this in a config file:

store$n   slot$s1   slot$s2

where $n, $s1 and $s2 are just numbers.  Slots are numbered as
%d%02d with server and partition numbers (it will break if we ever
have over 100 slots on a machine, but I'm happy to renumber at that
point).  Our biggest so far is 40.  When I set this up the biggest was
8.  Future proofing something so easily reconfigurable would have just
meant more typing in the meanwhile.
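
A small Python sketch of the numbering scheme, assuming the server number fills the %d and the partition number fills the %02d. The function names, the regular expression and the sample values are hypothetical; only the "%d%02d" format and the "store$n slot$s1 slot$s2" line shape come from the description above.

```python
import re


def slot_id(server, partition):
    """Build a slot number as %d%02d from server and partition numbers."""
    if not 0 <= partition < 100:
        # Beyond two digits the ids become ambiguous, e.g. server 1,
        # partition 100 and server 11, partition 0 both give "1100".
        raise ValueError("partition must fit in two digits")
    return "%d%02d" % (server, partition)


# One config line: a store number followed by its two slots.
LINE_RE = re.compile(r"^store(\d+)\s+slot(\d+)\s+slot(\d+)$")


def parse_store_line(line):
    """Parse 'store$n slot$s1 slot$s2' into a small dict."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a store line: %r" % line)
    store, slot1, slot2 = match.groups()
    return {"store": int(store), "slots": (slot1, slot2)}
```

For example, `slot_id(3, 7)` gives "307", and `parse_store_line("store5   slot307   slot412")` gives store 5 with slots "307" and "412".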

> Do you mean dividing the users to different cyrus instances. That is a
> maintenance issue IMHO. 

It's amazing what you can do with good tools - besides, if your site is
already bigger than any one single machine then you already have the
issue.  Might as well be smart about it.

As I said upthread somewhere - moving a user is pretty easy for us:

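use strict;
use warnings;
use ME::User;

# arguments: the user to move, then the server to move them to
my $UserName = shift;
my $TargetServer = shift;

# look the user up, then move them to the target server
my $User = ME::User->new_find($UserName);
$User->MoveUser($TargetServer);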
 
> But the new Solid-State-Disks seem very promising. They are claimed to
> give 30x the throughput of a 15k rpm disk. If IO improves by 30 times
> that should make all these optimizations unnecessary. 
> As my boss used to tell me ... Good hardware always compensates for
> not-so-good software. 

Yeah, that would be nice.  Modulo the rewrite cost of course.  Note that
mailboxes.db is a skiplist file.  Skiplists make a lot of random updates,
4 bytes at a time, when you append a record.  Imagine what that costs if
your minimum rewrite block is larger than the size of the whole file.  
You'd be better off going to flatfile DB.  I'm not kidding you here.
Running "recovery" at startup time would take days on a reasonably
sized DB.  Check out the seeks and rewrites that baby does.  (ok, so if
your filesystem isn't mounted writeback it would probably only rewrite
twice when you actually did the fsyncs.  So much for rhetorical devices)
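
The rewrite-cost argument can be put into rough numbers. This is a back-of-envelope Python sketch, not a measurement: the 512 KiB rewrite block, the 10 pointer updates per append and the function names are all hypothetical. It compares the worst case, where every 4-byte update forces its own block rewrite, with the coalesced case from the aside above, where the block is rewritten only at each fsync.

```python
def worst_case_bytes(num_updates, block_bytes):
    """Every small update is flushed alone and rewrites one whole block."""
    return num_updates * block_bytes


def coalesced_bytes(num_fsyncs, blocks_touched, block_bytes):
    """Updates are buffered; each fsync rewrites only the touched blocks."""
    return num_fsyncs * blocks_touched * block_bytes


def amplification(physical_bytes, num_updates, update_bytes=4):
    """Bytes physically rewritten per byte logically changed."""
    return physical_bytes / (num_updates * update_bytes)


BLOCK = 512 * 1024   # hypothetical minimum rewrite block, in bytes
UPDATES = 10         # hypothetical 4-byte pointer updates for one append

worst = worst_case_bytes(UPDATES, BLOCK)   # 10 whole-block rewrites
coalesced = coalesced_bytes(2, 1, BLOCK)   # 2 fsyncs, file fits in 1 block

print(worst, amplification(worst, UPDATES))
print(coalesced, amplification(coalesced, UPDATES))
```

Under these assumed numbers, 40 bytes of logical change cost 5 MiB of rewriting in the worst case and 1 MiB when the writes are coalesced into two fsyncs.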

Bron ( rambling again )


More information about the Info-cyrus mailing list