Status of Cyrus replication

Bron Gondwana brong at fastmail.fm
Fri May 23 06:16:41 EDT 2008


On Thu, May 22, 2008 at 02:17:02PM -0500, Blake Hudson wrote:
> Hey all, last time I checked replication was undergoing major overhauls 
> and incompatibility between minor versions of 2.3.x was pretty great. 
> There were also a few bugs that could potentially cause trouble down the 
> road. I've had the need to create setups with failover servers and have 
> continued using rsync on an interval (~30 to 60 min) for this purpose. 
> Unfortunately this causes quite a lot of IO load on the servers and I 
> was hoping that a rolling replication setup would help resolve this.

Yeah, it would!  Are you using rsync 3.0?  It won't reduce the IO
load, but at least it builds its file list incrementally instead of
scanning everything before it starts transferring.

Also, you can get huge performance wins with a tiny bit of custom code,
something like this hunk of untested perl:

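# $dir (hypothetical name) is the path of one mailbox directory in the spool
opendir(DH, $dir) or die "opendir $dir: $!";
# assign explicitly: older perls don't set $_ for a bare readdir in while()
while (defined(my $name = readdir(DH))) {
  if ($name =~ m/^cyrus\./) {
    # rsync this file, could have changed arbitrarily
  }
  elsif ($name =~ m/^\d+\.$/) {
    # this is a cyrus message file, if it exists on the replica then
    # no need to try and sync
  }
  elsif ($name !~ m/^\./) {
    # this is a subfolder, sync it.
  }
}
closedir(DH);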

Basically, you never need to stat the message files, which are the
bulk of your data: if one already exists on the replica, it doesn't
need syncing again.

... but that's still a lot of custom protocol development and stuff.
Annoying.
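
As a rough, runnable version of the hunk above: the sketch below sorts
the entries of one mailbox directory into the same three classes and
returns them, so a caller can decide what to hand to rsync.  The
function name classify_spool_dir and the shape of the returned hash are
made up for illustration; the patterns are the ones from the hunk.  It
has not been run against a real Cyrus spool.

```perl
use strict;
use warnings;

# Sort the entries of one mailbox directory into three classes.
# Returns a hash ref with keys meta, messages and subfolders, each an
# array ref of file names in sorted order.
sub classify_spool_dir {
    my ($dir) = @_;
    my %out = (meta => [], messages => [], subfolders => []);

    opendir(my $dh, $dir) or die "opendir $dir: $!";
    while (defined(my $name = readdir($dh))) {
        if ($name =~ m/^cyrus\./) {
            push @{ $out{meta} }, $name;        # may have changed: always resync
        }
        elsif ($name =~ m/^\d+\.$/) {
            push @{ $out{messages} }, $name;    # copy only if absent on the replica
        }
        elsif ($name !~ m/^\./) {
            push @{ $out{subfolders} }, $name;  # treated as a subfolder: recurse
        }
    }
    closedir($dh);

    # readdir order is arbitrary, so sort for predictable output
    @$_ = sort @$_ for values %out;
    return \%out;
}
```

Like the hunk, this treats every remaining entry that doesn't start
with a dot as a subfolder without stat'ing it; a stricter version could
add a -d test on just those few names.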

> What's the status of Cyrus replication in the latest releases of 2.3.x - 
> specifically with virtual domains enabled?

It's getting pretty good actually.  Most of our replication errors
for the last couple of weeks have been traced back to a bug in our
automated user-move code, which meant it failed to add a "USER $foo"
to the sync log after moving users to new servers - so moved users
who had no activity were not replicated.
 
> It also seems like there have been some problems with the latest 
> releases of 2.3 and I'm hesitant to upgrade my 99% working 2.3.1 
> install. Any lingering issues or reason not to upgrade?

There were some bad times in there.  The only outstanding bug I'm
aware of in 2.3.12 is a segfault on blank lines in the config file -
you'll either see that straight away or not at all!

> For those who have the need to create a "hot spare" server and are not 
> using Cyrus replication, what method are you guys using to accomplish 
> this goal?

Our backup system (not quite the same!) uses a perl module which
reads the folder records from mailboxes.db and then uses fcntl locks
on the cyrus.* files in each folder to block out cyrus while it
streams the cyrus.* files.  These are then backed up, and also
parsed to see what message files are indexed - this is compared
against what has already been fetched, and any new messages are
also fetched and stored.  It's blindingly quick because it relies on
intimate knowledge of Cyrus's internals.
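
The "compare against what has already been fetched" step boils down to
a set difference on message UIDs.  A minimal sketch, assuming the UIDs
listed in cyrus.index have already been extracted by some parser (not
shown here, and not the actual backup module); the function name
uids_to_fetch is hypothetical:

```perl
use strict;
use warnings;

# Given the UIDs listed in the index and the UIDs already held by the
# backup, return the UIDs that still need fetching, in ascending order.
sub uids_to_fetch {
    my ($indexed, $have) = @_;
    my %have = map { $_ => 1 } @$have;
    return [ sort { $a <=> $b } grep { !$have{$_} } @$indexed ];
}
```

UIDs that are in the backup but no longer in the index are simply
ignored by this sketch; what to do with those is a separate policy
question.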

In the best case, no matter how big the folder, it costs only two
stats (cyrus.header and cyrus.index, we don't bother backing up
cyrus.cache since it's all derived information).  If either of
them has changed we stream the contents of them both.  Only then,
if there are new message files, do we cause any IO on the data
partition, and that is just opening files directly by name.  No
readdirs ever.
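
A sketch of that two-stat check, under stated assumptions: "changed" is
judged here by comparing size and mtime against values saved from the
previous run, which is a guess at a reasonable test rather than a
description of what the real backup code compares.  The function names
and the %seen bookkeeping are invented for illustration.

```perl
use strict;
use warnings;

# Cheap fingerprint of one file from a single stat: "size:mtime",
# or the empty string if the file does not exist.
sub stat_fingerprint {
    my ($path) = @_;
    my @st = stat($path) or return '';
    return "$st[7]:$st[9]";
}

# True if cyrus.header or cyrus.index differs from the fingerprints saved
# by the previous run.  $seen is a hash ref (file name => fingerprint) and
# is updated in place.  Exactly two stats, however big the folder is.
sub folder_changed {
    my ($dir, $seen) = @_;
    my $changed = 0;
    for my $name (qw(cyrus.header cyrus.index)) {
        my $now = stat_fingerprint("$dir/$name");
        $changed = 1 if ($seen->{$name} // '') ne $now;
        $seen->{$name} = $now;
    }
    return $changed;
}

# Message files are opened directly by name ("<uid>."), never via readdir.
sub message_path {
    my ($dir, $uid) = @_;
    return "$dir/$uid.";
}
```

Since mtime typically has one-second resolution, a same-size rewrite
within the same second could slip past this particular test; a stricter
check would need more than size and mtime.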

Bron.


More information about the Info-cyrus mailing list