Backup compaction optimization in a block-level replication environment

Deborah Pickett debbiep at polyfoam.com.au
Wed Nov 6 04:20:53 EST 2019


(Sorry, that's a lot of big words.  I'll try explaining what I want to do.)

On my LAN I have a Cyrus IMAP server (3.0.11), and a dedicated Cyrus 
backup server (patched with Ellie's shared-mailbox and 64-bit fseek 
fixes).  These are connected by a nice fat link so backups happen fast 
and often.  A scheduled compaction occurs each morning thanks to an 
event in cyrus.conf.

I now want to back up the backups to an off-site server over a much 
slower link.  The off-site server doesn't speak the Cyrus sync 
protocol.  What it does do well is block-level backups: if only a part 
of a file has changed, only that part needs to be transferred over the 
slow link.  [I haven't decided whether my technology will be the rsync 
--checksum protocol, or Synology NAS XFS replication, or Microsoft 
Server VFS snapshots.  They all do block-level backups well.]

Since Cyrus backup files are append-only, they should behave well with 
block-level backups. But—correct me if I'm wrong—compaction is going to 
ruin my day because a reduction in the size of chunk (say) 5 moves the 
start offset of chunk 6 (and so on).  Even if chunk 6 doesn't change 
it'll have to be retransmitted in its entirety.
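
The offset problem can be illustrated with a small simulation. This is
only a sketch under stated assumptions: it models a replicator that
compares fixed-size blocks at fixed offsets (as a snapshot-style tool
might), the 4096-byte block size and the eight equal-sized random
"chunks" are made up for illustration, and none of the names come from
Cyrus. Tools that use rolling checksums to find shifted data may behave
better than this model suggests.

```python
import hashlib
import random

BLOCK = 4096  # hypothetical block size of the replication tool


def make_bytes(rng, n):
    """Return n pseudo-random bytes (stand-in for compressed chunk data)."""
    return rng.getrandbits(8 * n).to_bytes(n, "big")


def block_hashes(data):
    """Hash each fixed-size block of data, in file order."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]


def changed_blocks(old, new):
    """Count blocks of `new` that differ from the same-offset block of `old`."""
    old_h = block_hashes(old)
    new_h = block_hashes(new)
    return sum(1 for i, h in enumerate(new_h)
               if i >= len(old_h) or old_h[i] != h)


rng = random.Random(0)
chunks = [make_bytes(rng, 10 * BLOCK) for _ in range(8)]  # 8 chunks, 10 blocks each
before = b"".join(chunks)

# Simulated compaction: chunk 5 (index 4) loses its last 100 bytes.
# Chunks 6 to 8 are byte-for-byte identical but now start 100 bytes earlier.
compacted = list(chunks)
compacted[4] = chunks[4][:-100]
after = b"".join(compacted)

total = len(block_hashes(after))
print(changed_blocks(before, after), "of", total, "blocks differ")
```

In this model, every block from the tail of chunk 5 onward counts as
changed (31 of 80 here), even though only 100 bytes were removed.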

Right now I've set backup_compact_minsize and backup_compact_maxsize to 
zero but I'm not sure if even that is sufficient to prevent chunk 
offsets moving.  Perhaps I need to disable the compaction event in 
cyrus.conf entirely.

I really want compaction, though, or else my backups are going to get 
very, very big.

Which leads me to my idea.  What if compaction could be friendlier 
towards block-level backups, by deliberately avoiding changing chunk 
offsets in the backup file, even if that means gaps of unused bytes when 
(the aforementioned) chunk 5 shrinks?  It won't always work out, for 
instance when a chunk grows in size. A compromise would need to be 
struck between keeping chunk offsets fixed and limiting the wasted, 
fragmented space left between chunks as they shrink.
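
As a rough sketch of what such a compromise might look like (purely
hypothetical: plan_offsets and max_waste are names invented here, not
anything in Cyrus), a layout planner could keep each chunk at its old
offset whenever the chunk still fits there and the total padding stays
within a budget, and otherwise pack the chunk directly after the
previous one:

```python
def plan_offsets(old_sizes, new_sizes, max_waste):
    """Choose new start offsets for chunks after compaction.

    old_sizes / new_sizes: chunk sizes in bytes before and after compaction.
    max_waste: hypothetical budget for total padding bytes in the file.
    Returns (offsets, wasted_bytes).
    """
    offsets = []
    old_start = 0  # where the current chunk started in the old file
    cursor = 0     # first free byte in the new file
    waste = 0      # padding bytes inserted so far
    for old, new in zip(old_sizes, new_sizes):
        gap = old_start - cursor  # padding needed to keep the old offset
        if gap >= 0 and waste + gap <= max_waste:
            # Keep the old offset, leaving `gap` unused bytes before the chunk.
            cursor = old_start
            waste += gap
        # Otherwise the chunk is packed at `cursor` and its offset moves.
        offsets.append(cursor)
        cursor += new
        old_start += old
    return offsets, waste


old = [100] * 8
shrunk = [100, 100, 100, 100, 60, 100, 100, 100]  # chunk 5 shrinks by 40 bytes

print(plan_offsets(old, shrunk, max_waste=64))  # offsets preserved, 40 bytes padded
print(plan_offsets(old, shrunk, max_waste=0))   # tightly packed, later offsets move
```

With a budget of 64 bytes, chunks 6 to 8 keep their old offsets at the
cost of 40 padding bytes; with a budget of zero, the file is packed
tightly and those chunks all move. A chunk that grows past its old slot
pushes its successors along until later shrinkage lets them realign.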

I haven't collected enough data to know if I am making the right 
assumptions about how chunk size evolves over time and how effective 
compaction is at removing cruft from a backup file.  Has anyone thought 
about doing something like this with Cyrus backups?

-- 
Deborah Pickett
System Administrator
Polyfoam Australia Pty Ltd

