Backup compaction optimization in a block-level replication environment

Deborah Pickett debbiep at polyfoam.com.au
Fri Nov 15 00:12:21 EST 2019


On 2019-11-11 11:10, ellie timoney wrote:
>>> This setting might be helpful:
>> Thanks, I saw that setting but didn't really think through how it would
>> help me.  I'll experiment with it and report back.
> That would be great, thanks!

Progress report: I started with very large chunks (minimum 64 MB, 
maximum 1024 MB) and a threshold of 8 chunks, but I found that 
compaction ran every time, even on a backup file that had hardly 
changed.  I'm not certain why; my current theory is that with chunks 
that size there is almost always some benefit to compacting, so the 
threshold is passed easily.  There were 41 chunks in my %SHARED backup.
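
For anyone following along, the knobs I'm turning are the 
backup_compact_minsize, backup_compact_maxsize and 
backup_compact_work_threshold options in imapd.conf.  The large-chunk 
run above looked roughly like this (sizes are in kB, if I'm reading 
the man page right):

    # 64 MB minimum / 1024 MB maximum chunk size; only compact when at
    # least 8 chunks look like they would benefit
    backup_compact_minsize: 65536
    backup_compact_maxsize: 1048576
    backup_compact_work_threshold: 8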

I'm now trying very small chunks (no minimum size, maximum 128 kB) with 
varying thresholds.  This is probably _too_ small (smaller than even 
some messages).  I'll bisect the difference and see if there is a sweet 
spot.

I've settled on rsync as the transport for sending the backups off 
site.  Its rolling-checksum algorithm means that even if a chunk moves 
within the file, it can still be transmitted efficiently, saving me 
from needing all that fragmentation guff I started this thread with.
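
In case it's useful to anyone, the invocation is along these lines 
(the host and paths are placeholders, not my real layout):

    # -a archive mode, -v verbose.  Over a remote connection rsync's
    # delta (rolling checksum) transfer is the default, so only the
    # changed blocks of each backup file cross the wire.
    # --inplace rewrites the destination file in place instead of
    # building a temporary copy, which suits these large backup files.
    rsync -av --inplace \
        /var/spool/cyrus/backup/ backup@offsite.example.net:/srv/cyrus-backups/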

Related: I had to apply the patch described at 
https://www.mail-archive.com/info-cyrus@lists.andrew.cmu.edu/msg47320.html 
("backupd IOERROR reading backup files larger than 2GB"), because 
during initial population of my backup, chunks tended to be multiple 
GB in size (my %SHARED user backup is 23 GB, compressed).  Will this 
patch be merged into the mainline?

-- 
*Deborah Pickett*
System Administrator
*Polyfoam Australia Pty Ltd*


