Backup strategy for large mailbox stores
vbfox at ucdavis.edu
Tue Feb 16 17:26:06 EST 2010
Clement Hermann (nodens) wrote:
> The snapshot approach (we use ext3 and lvm, soon ext4) is promising, as
> a simple tar is faster than using the full backup suite on a filesystem
> with a lot of small files (atempo here). But you need the spare space
> locally, or you need to do it over the network, and it will take time
> (but won't probably kill disk I/O as much as the backup software).
That is one of the strengths of COW. Snaps are atomic
and are simply pointers into a block structure that is no
longer writable; it is not like snapshots of old, when a
100G filesystem needed another 100G of space for the snap.
A ZFS snapshot only "costs" the difference of the changes.
As I said, 14 days of snapshots imposes an approximate
50% overhead on our servers: out of a terabyte pool, 200G
is the current dataset and 100G is overhead for the snaps,
which is still only 300G out of a terabyte.
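As a toy illustration of why a COW snapshot only costs the delta, here is a minimal sketch (the class and its block model are my own simplification, not ZFS internals):

```python
# Toy copy-on-write model: a snapshot freezes the current block
# pointers; only blocks rewritten *after* the snapshot consume
# extra space, because the old physical blocks stay pinned.

class CowFs:
    def __init__(self):
        self.live = {}        # logical block id -> (physical block, data)
        self.next_phys = 0
        self.snapshots = []   # each snapshot is a frozen pointer map

    def write(self, block_id, data):
        # COW: every write allocates a fresh physical block
        self.live[block_id] = (self.next_phys, data)
        self.next_phys += 1

    def snapshot(self):
        # Atomic: just copy the pointer map, no data is duplicated
        self.snapshots.append(dict(self.live))

    def snapshot_overhead(self):
        # Physical blocks referenced only by snapshots, not by the
        # live filesystem -- the space the snapshots "cost"
        live_phys = {phys for phys, _ in self.live.values()}
        snap_phys = {phys for snap in self.snapshots
                     for phys, _ in snap.values()}
        return len(snap_phys - live_phys)

fs = CowFs()
for b in range(200):          # 200 "gigs" of current data
    fs.write(b, "v1")
fs.snapshot()                 # costs nothing at the moment it is taken
for b in range(100):          # then half the data gets rewritten
    fs.write(b, "v2")
print(fs.snapshot_overhead()) # only the 100 rewritten blocks are pinned
```

The rough proportions match the numbers above: a 200G dataset plus
100G of changed blocks pinned by snapshots.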
I'd be tempted to up the snapshot limit, but that would
require a bunch of meetings, as the retention policy here
has been 2 weeks for ages, and what's the point in
changing it? Everyone is moving to Google eventually
anyhow, so it hardly seems worth messing with a system
that's been rock-solid for 3 years but is being retired.
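For what it's worth, the 2-week retention itself is trivial to express in whatever script rotates the snapshots; a sketch of the pruning logic (the function name and date-based naming scheme are my assumptions, not our actual script):

```python
from datetime import date, timedelta

def snapshots_to_destroy(snapshot_dates, today, keep_days=14):
    """Return snapshot dates that fall outside the retention window.

    A stand-in for whatever walks `zfs list -t snapshot` output;
    the 14-day default matches the retention policy described above.
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in snapshot_dates if d < cutoff)

today = date(2010, 2, 16)
snaps = [today - timedelta(days=n) for n in range(20)]  # 20 dailies
old = snapshots_to_destroy(snaps, today)
print(len(old))  # 5 snapshots are older than the 14-day window
```

Changing the policy really is a one-number edit; the meetings are
the expensive part.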
One weakness of COW, which you'll probably also see in
BTRFS, is that fragmentation becomes a problem on very
full filesystems with frequent changes.
We never run over 30% full, and filesystems that have been
up for 3 years are still perfectly fine. But if you push
the limits it may be a problem for you. Background defrag
for ZFS is in the pipe, and I hope BTRFS will have
something similar.
More information about the Info-cyrus mailing list