Backup compaction optimization in a block-level replication environment
Deborah Pickett
debbiep at polyfoam.com.au
Wed Nov 6 04:20:53 EST 2019
(Sorry, that's a lot of big words. I'll try explaining what I want to do.)
On my LAN I have a Cyrus IMAP server (3.0.11), and a dedicated Cyrus
backup server (patched with Ellie's shared-mailbox and 64-bit fseek
fixes). These are connected by a nice fat link so backups happen fast
and often. A scheduled compaction occurs each morning thanks to an
event in cyrus.conf.
I now want to back up the backups to an off-site server over a much
slower link. The off-site server doesn't speak the Cyrus sync
protocol. What it does do well is block-level backups: if only a part
of a file has changed, only that part needs to be transferred over the
slow link. [I haven't decided whether my technology will be the rsync
--checksum protocol, or Synology NAS XFS replication, or Microsoft
Server VSS snapshots. They all do block-level backups well.]
Since Cyrus backup files are append-only, they should behave well with
block-level backups. But (correct me if I'm wrong) compaction is going
to ruin my day, because a reduction in the size of chunk (say) 5 moves
the start offset of chunk 6 (and so on). Even if chunk 6 doesn't change,
it'll have to be retransmitted in its entirety.
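
To make the concern concrete, here is a minimal Python sketch of a
fixed-offset block comparison. Everything in it is an assumption for
illustration: the 4096-byte block size, the chunk sizes, and the helper
names (fake_chunk, block_hashes, blocks_to_send) are hypothetical and
have nothing to do with the Cyrus backup format itself. It models tools
that compare blocks at fixed offsets, as snapshot replication does;
rsync's rolling checksum can match shifted data, so it may fare better
than this sketch suggests.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real replication tools vary


def fake_chunk(seed: int, size: int) -> bytes:
    """Deterministic pseudo-random bytes standing in for one backup chunk."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def block_hashes(data: bytes) -> list:
    """Hash each fixed-size block of the file, in order."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def blocks_to_send(old: bytes, new: bytes) -> int:
    """Count blocks of `new` that differ from the block at the same offset in `old`."""
    old_h = block_hashes(old)
    new_h = block_hashes(new)
    changed = sum(1 for a, b in zip(old_h, new_h) if a != b)
    # Blocks past the end of the old file are always new.
    return changed + max(0, len(new_h) - len(old_h))


# Ten chunks of 8 blocks each, concatenated into one "backup file".
chunks = [fake_chunk(i, 8 * BLOCK_SIZE) for i in range(10)]
original = b"".join(chunks)

# Case 1: append-only growth (two more blocks of data).
appended = original + fake_chunk(99, 2 * BLOCK_SIZE)

# Case 2: "compaction" shrinks chunk 5 (index 4) by 1000 bytes,
# shifting every later chunk to a lower offset.
shrunk = chunks[4][:-1000]
compacted = b"".join(chunks[:4] + [shrunk] + chunks[5:])

print(blocks_to_send(original, appended))   # 2
print(blocks_to_send(original, compacted))  # 41
```

In this toy file of 80 blocks, appending costs 2 blocks, while shrinking
one chunk by 1000 bytes dirties the block where the shrink happened and
all 40 blocks after it.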
Right now I've set backup_compact_minsize and backup_compact_maxsize to
zero, but I'm not sure whether even that is sufficient to prevent chunk
offsets from moving. Perhaps I need to disable the compaction event in
cyrus.conf entirely.
I really want compaction, though, or else my backups are going to get
very, very big.
Which leads me to my idea. What if compaction could be friendlier
towards block-level backups, by deliberately avoiding changing chunk
offsets in the backup file, even if that means gaps of unused bytes when
(the aforementioned) chunk 5 shrinks? It won't always work out, for
instance when a chunk grows in size. A compromise would need to be
struck between keeping chunk offsets fixed and wasting fragmented space
between chunks as they shrink.
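
One way to picture that compromise is the following Python sketch. It is
not how Cyrus compaction works; plan_layout, Placement and max_gap are
hypothetical names invented for this illustration. The assumed policy:
a chunk keeps its old offset as long as the previous chunk no longer
overlaps it and the unused gap in front of it is at most max_gap bytes;
otherwise the chunk is packed directly after the previous one.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    offset: int   # where the chunk starts in the compacted file
    size: int     # chunk size after compaction
    moved: bool   # True if the offset differs from the old file


def plan_layout(old_offsets, new_sizes, max_gap):
    """Plan chunk offsets, preferring old offsets over tight packing.

    old_offsets: ascending start offsets of chunks in the existing file
    new_sizes:   size of each chunk after compaction
    max_gap:     largest run of unused bytes tolerated before a chunk
    """
    placements = []
    cursor = 0  # first free byte in the new file
    for old_offset, size in zip(old_offsets, new_sizes):
        gap = old_offset - cursor
        if 0 <= gap <= max_gap:
            offset = old_offset  # keep the offset, accept `gap` wasted bytes
        else:
            offset = cursor      # previous chunk grew, or the gap is too big
        placements.append(Placement(offset, size, moved=offset != old_offset))
        cursor = offset + size
    return placements


old_offsets = [0, 100, 200, 300]  # four chunks of 100 bytes each

# Chunk 2 shrinks a little: later offsets are preserved, 40 bytes wasted.
small_shrink = plan_layout(old_offsets, [100, 60, 100, 100], max_gap=50)

# Chunk 2 shrinks a lot: the gap exceeds max_gap, so later chunks move.
big_shrink = plan_layout(old_offsets, [100, 20, 100, 100], max_gap=50)

# Chunk 2 grows: later chunks cannot keep their offsets.
growth = plan_layout(old_offsets, [100, 130, 100, 100], max_gap=50)

for name, plan in [("small", small_shrink), ("big", big_shrink), ("grow", growth)]:
    print(name, [(p.offset, p.moved) for p in plan])
```

Under this policy, once one chunk has moved, the chunks after it also
move as long as they only shrink, so the file gets repacked from that
point on. A real implementation would presumably also have to record the
gaps somewhere so that readers of the backup file can skip them.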
I haven't collected enough data to know if I am making the right
assumptions about how chunk size evolves over time and how effective
compaction is at removing cruft from a backup file. Has anyone thought
about doing something like this with Cyrus backups?
--
Deborah Pickett
System Administrator
Polyfoam Australia Pty Ltd