Doing Cyrus *right* for a large installation
robm at fastmail.fm
Thu Jan 15 18:49:19 EST 2009
> I would like to add a *lot* more storage so that we can increase our email
> quotas (currently 200MB per user). It seems like the proper way to scale
> up is to split the Cyrus metadata off and use some large SATA drives for
> the message files. I was considering adding a shelf of 1TB SATA drives to
> our SAN. I could store the metadata on existing FC drives on the SAN, or
> just use internal disks on the servers.
We split meta and data onto separate storage: 10k/15k RPM RAID1 for meta,
7.2k RPM RAID5 for data. The meta is about 1/20th the size of the data. The
meta drives see more writes, while the data drives see more reads (we have
lots of memory now, so much of the meta is probably cached). On average,
utilisation for meta is still higher than for data, but the two seem
relatively well balanced with that split.
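As a rough sizing illustration of that 1/20 ratio (a sketch only: the
function name is made up here, the 12 TB input is just an example, and your
own meta/data ratio may well differ):

```python
def meta_size_tb(data_tb, data_to_meta_divisor=20):
    """Estimate the metadata volume size (TB) from the message data size (TB).

    The default divisor of 20 reflects the roughly 1/20 meta-to-data ratio
    described above; treat it as a starting point, not a rule.
    """
    return data_tb / data_to_meta_divisor


# Example: 12 TB of message files suggests roughly 0.6 TB of meta.
print(meta_size_tb(12))
```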
> But then I started thinking about how I was going to backup all this new
> data... Our backup administrator isn't too excited about trying to backup
> 12TB of email data.
We back up to an X4500 server. Bron built our custom backup system for Cyrus.
Each Cyrus machine runs a backup daemon that speaks a simple network
protocol. The daemon knows where the meta and data files are, and can read
and understand cyrus.* files. A backup process on the X4500 runs each day,
connects to the daemon on each Cyrus machine, uses it to find the changes
for each folder of each user, and updates the backup on the X4500. All
backups are stored as .tgz streams containing a copy of every email and
every cyrus.* file. Metadata about the backup is stored in an SQLite file.
In general the backup process just appends to the .tgz stream. When it
calculates that the ratio of "old" data in the .tgz is too high, it re-packs
the whole thing removing the old data.
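The re-pack decision can be sketched roughly like this. It's an
illustration only, not our actual code: the function name, the byte
counters and the 0.5 threshold are all assumptions made up for the example.

```python
def should_repack(live_bytes, total_bytes, max_old_ratio=0.5):
    """Decide whether an append-only backup stream is worth re-packing.

    live_bytes:    bytes in the stream still referenced by the current backup
    total_bytes:   total bytes in the stream, including superseded entries
    max_old_ratio: hypothetical threshold for the fraction of "old" data
    """
    if total_bytes <= 0:
        # An empty stream has nothing to reclaim.
        return False
    old_ratio = (total_bytes - live_bytes) / total_bytes
    return old_ratio > max_old_ratio


# 60% of the stream is old data, so a re-pack would be triggered.
print(should_repack(live_bytes=40, total_bytes=100))
# Only 20% is old data, so the process would just keep appending.
print(should_repack(live_bytes=80, total_bytes=100))
```

The point of the threshold is to trade disk space against re-pack I/O:
appending is cheap, re-writing the whole stream is not, so you only do it
once enough of the stream is dead weight.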
The whole thing relies a lot on internal knowledge of our setup, so it's not
something we can easily release unfortunately.
> What if we used Cyrus replication in combination with delayed expunge as a
> form of "backup"? We currently only keep 1 month of daily backups
It's an option, but it's a bit scary still. What if there's a replication
protocol error that blows away your replica? Unlikely, but possible.
I think we might be a bit paranoid. We don't like relying on any one thing.
Filesystems, software, hardware, etc. Net result is we've ended up with
quite a few levels of redundancy.
1. All data on RAID, so any HD failure just means replacing the HD, with no
downtime
2. Delayed delete, so any user deletion error can be fixed by re-inserting
the deleted messages
3. All data replicated, so any server/storage unit failure just means
switching to the replica
4. Nightly backups to a completely separate server, with a different OS and
filesystem, and with no shared credentials or trust. Basically a last resort
in case of a major hardware/OS/security screw-up, one that you absolutely
hope you never have to use.