Backup strategy for large mailbox stores
vbfox at ucdavis.edu
Tue Feb 16 17:26:06 EST 2010
Clement Hermann (nodens) wrote:
> The snapshot approach (we use ext3 and lvm, soon ext4) is promising, as
> a simple tar is faster than using the full backup suite on a filesystem
> with a lot of small files (atempo here). But you need the spare space
> locally, or you need to do it over the network, and it will take time
> (but won't probably kill disk I/O as much as the backup software).
That is one of the strengths of COW. Snaps are atomic
and are simply pointers into a block structure that is no
longer writable; it is not like snapshots of old, when a
100G filesystem needed another 100G of space for the snap.
A ZFS snapshot only "costs" the difference of the changes.
As I said, 14 days of snapshots imposes an approximate
50% overhead on our servers: out of a terabyte pool, 200G
is the current dataset and 100G is overhead for the snaps,
which is still only 300G out of a terabyte.
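As a toy illustration of why a COW snapshot only costs the delta, here is a minimal sketch (the class and its block model are my own simplification, not ZFS internals):

```python
# Toy copy-on-write model: a snapshot freezes the current block
# pointers; only blocks rewritten *after* the snapshot consume
# extra space, because the old physical blocks stay pinned.

class CowFs:
    def __init__(self):
        self.live = {}        # logical block id -> (physical block, data)
        self.next_phys = 0
        self.snapshots = []   # each snapshot is a frozen pointer map

    def write(self, block_id, data):
        # COW: every write allocates a fresh physical block
        self.live[block_id] = (self.next_phys, data)
        self.next_phys += 1

    def snapshot(self):
        # Atomic: just copy the pointer map, no data is duplicated
        self.snapshots.append(dict(self.live))

    def snapshot_overhead(self):
        # Physical blocks referenced only by snapshots, not by the
        # live filesystem -- the space the snapshots "cost"
        live_phys = {phys for phys, _ in self.live.values()}
        snap_phys = {phys for snap in self.snapshots
                     for phys, _ in snap.values()}
        return len(snap_phys - live_phys)

fs = CowFs()
for b in range(200):          # 200 "gigs" of current data
    fs.write(b, "v1")
fs.snapshot()                 # costs nothing at the moment it is taken
for b in range(100):          # then half the data gets rewritten
    fs.write(b, "v2")
print(fs.snapshot_overhead()) # only the 100 rewritten blocks are pinned
```

The rough proportions match the numbers above: a 200G dataset plus
100G of changed blocks pinned by snapshots.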
I'd be tempted to up the snapshot limit, but that would
require a bunch of meetings, as the retention policy here
has been 2 weeks for ages, and what's the point in
changing it? Everyone is moving to Google eventually
anyhow, so it hardly seems worth messing with a system
that's been rock-solid for 3 years but is being retired.
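For what it's worth, the 2-week retention itself is trivial to express in whatever script rotates the snapshots; a sketch of the pruning logic (the function name and date-based naming scheme are my assumptions, not our actual script):

```python
from datetime import date, timedelta

def snapshots_to_destroy(snapshot_dates, today, keep_days=14):
    """Return snapshot dates that fall outside the retention window.

    A stand-in for whatever walks `zfs list -t snapshot` output;
    the 14-day default matches the retention policy described above.
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in snapshot_dates if d < cutoff)

today = date(2010, 2, 16)
snaps = [today - timedelta(days=n) for n in range(20)]  # 20 dailies
old = snapshots_to_destroy(snaps, today)
print(len(old))  # 5 snapshots are older than the 14-day window
```

Changing the policy really is a one-number edit; the meetings are
the expensive part.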
One weakness of COW, which you'll probably also see in
BTRFS, is that fragmentation becomes a problem on very
full filesystems with frequent changes.
We never run over 30% full, and filesystems that have been
up for 3 years are still perfectly fine. But if you push
the limits it may be a problem for you. Background defrag
for ZFS is in the pipe, and I hope BTRFS will have
something similar.
More information about the Info-cyrus mailing list