Backup strategy for large mailbox stores
John Madden
jmadden at ivytech.edu
Mon Feb 15 10:46:46 EST 2010
>> Is there a better strategy, probably within the Cyrus framework, to
>> take backups efficiently?
We're a large site (400k users with 1GB quotas, and growing) and this
has been our biggest problem for years too. Typical backup systems (we run
NetBackup), which scan the entire filesystem looking for changes, do not
scale. Until Cyrus uses a more efficient means of storing the data than
a single file for every message (which does have its own merits), the
problem is only going to get worse. Filesystems with 100 million files
can't be backed up in a reasonable time span and yes, the cause is the
stat() of every_single_file done during every_freaking_backup.
It's gotten to the point where I've considered writing my own
filesystem, something more along the lines of the Google filesystem
with chunkservers, implemented as a FUSE layer to avoid actual Cyrus
changes, so that backups can be done against nice large chunks. That
would be its own mess though, of course.
What Fastmail has done to fix this is really quite slick but only
applies to IMAP. We have other loads (TB-scale file servers, for
example) that will need a more generic solution. I've hatched this
wacky scheme for incrementals:
A daemon monitors each filesystem you're concerned about (I just
watch /), using Linux's inotify to pick up changes and writing each
one to a sqlite3 db. On backup start, that database is consulted for
a list of files that have changed, and a file is written that tells
the backup agent which files to fetch. If only 10,000 files have
changed, only 10,000 files are touched. There's some windowing logic
in there to make sure you're only looking at stuff changed since the
last backup was started and such, but that's the basic idea (rough
sketch below). The daemon consumes a lot of memory because each
directory has to be monitored individually and each of our servers
has about a million of them, so that's a knock against it. It does
appear to be pretty efficient otherwise, though.
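For the curious, a minimal sketch of that daemon, assuming the
pyinotify bindings and Python's stdlib sqlite3; the db path, table
name and the last_backup_start_time() helper are invented for
illustration, and the real thing needs more bookkeeping (deletions,
commit batching, error handling):

  #!/usr/bin/env python
  # Sketch: log every changed path into a sqlite3 db via inotify.
  import sqlite3
  import time
  import pyinotify

  db = sqlite3.connect('/var/lib/fsmon/changes.db')
  db.execute('CREATE TABLE IF NOT EXISTS changes (path TEXT, mtime REAL)')

  class Handler(pyinotify.ProcessEvent):
      def process_default(self, event):
          # Any create/write/delete/rename event gets logged with a
          # timestamp so backups can window on "since last run".
          db.execute('INSERT INTO changes VALUES (?, ?)',
                     (event.pathname, time.time()))
          db.commit()

  wm = pyinotify.WatchManager()
  mask = (pyinotify.IN_CLOSE_WRITE | pyinotify.IN_CREATE |
          pyinotify.IN_DELETE | pyinotify.IN_MOVED_TO)
  # rec=True puts a watch on every existing directory and auto_add
  # picks up new ones -- this is where all the memory goes, one
  # watch per directory.
  wm.add_watch('/', mask, rec=True, auto_add=True)
  pyinotify.Notifier(wm, Handler()).loop()

At backup start, the include list for the agent is just a query
against that table, windowed on the previous backup's start time:

  since = last_backup_start_time()   # hypothetical helper
  for (path,) in db.execute(
          'SELECT DISTINCT path FROM changes WHERE mtime >= ?',
          (since,)):
      print(path)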
That still leaves full backups as a big issue (they take days to
run), and NetBackup has a solution for that: you run one full backup
and store it on disk somewhere, and from then on fulls are
"synthetic fulls," where the incrementals are applied periodically
in snapshot fashion and voila, you have a full backup. After that
one initial full, the only thing you ever run is incrementals. This
takes 2x your disk, but it's manageable.
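Conceptually a synthetic full is just the stored full with each
incremental replayed over it in order; a generic sketch of the idea
(nothing NetBackup-specific, paths invented, and it ignores
deletions):

  # Replay incrementals over the one full kept on disk; the result
  # is equivalent to a fresh full without re-reading the mail spool.
  import shutil
  from pathlib import Path

  def apply_incremental(full_dir: Path, incr_dir: Path) -> None:
      for src in incr_dir.rglob('*'):
          if src.is_file():
              dst = full_dir / src.relative_to(incr_dir)
              dst.parent.mkdir(parents=True, exist_ok=True)
              shutil.copy2(src, dst)   # newer copy wins

  full = Path('/backup/full')
  for incr in sorted(Path('/backup/incrementals').iterdir()):
      apply_incremental(full, incr)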
John
--
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden at ivytech.edu