Making Replication Robust
dpc22 at cam.ac.uk
Tue Oct 9 11:10:19 EDT 2007
On Thu, 4 Oct 2007, Bron Gondwana wrote:
> a) MUST never lose a message that's been accepted for
> delivery except in the case of total drive failure.
> b) MUST have a standard way to integrity check and
> repair a replica-pair after a system crash.
A replica system is automatically repaired to match its master, but this
doesn't help with the split brain scenarios that you are worried about.
I've never faced a split brain situation which involved more than two or
three messages (the outstanding log on an old master system). I suspect
that this is simply because I've never had to run an unreliable
replication engine which bails out on my production systems.
> c) MUST have a clean process to "soft-failover" to the
> replica machine, making sure that all replication
> events from the ex-master have been synchronised.
Something more than sync_shutdown_file plus automatic retries on
recent work files?
> d) MUST have replication start/restart automatically when
> the replica is available rather than requiring it be
> online at master start time.
Work in progress from Ken.
> e) SHOULD be able to copy back messages which only exist
> on the replica due to a hard-failover, handling UIDs
> gracefully (more on this later),
This is the hard one. I think that assigning a new UIDvalidity and new
UIDs for all the messages would be best as messages can then be sorted in
the replacement mailbox based on their arrival time. Actually this would
look remarkably like the new sync_combine_commit() on the replica side.
What I don't know is how we then synchronise back to the master. Up to now
the replication engine has been very careful about _not_ making changes on
the master, so that it only has the potential to mess up the spare system.
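A rough sketch of the renumbering idea, in Python rather than the C of the real code. Everything here is hypothetical: the Message class, the combine() function and the "old value plus one" choice of UIDvalidity are assumptions for illustration, and this is not what sync_combine_commit() actually does. It shows only the replica-side merge, not the open question of syncing back to the master.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    guid: str          # content identifier for the message (hypothetical field)
    internaldate: int  # arrival time, seconds since the epoch


def combine(master, replica, old_uidvalidity):
    """Merge two copies of a mailbox and renumber every message."""
    # Keep one copy of each message, keyed by GUID.
    by_guid = {}
    for msg in list(master) + list(replica):
        by_guid.setdefault(msg.guid, msg)
    # Sort by arrival time; the GUID breaks ties so the order is stable.
    ordered = sorted(by_guid.values(), key=lambda m: (m.internaldate, m.guid))
    # Assumption: any larger value will do as the new UIDvalidity.
    new_uidvalidity = old_uidvalidity + 1
    # Assign fresh UIDs starting from 1 in arrival order.
    return new_uidvalidity, list(enumerate(ordered, start=1))
```

Clients would see the changed UIDvalidity and throw away their cached UIDs, which is the price of getting a clean arrival-time ordering.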
> alternatively at least
> MUST (to satisfy point 'a') notify the administrator
> that the message has different GUIDs on the two copies
> and something will need to be done about it (to satisfy
> point 'd' this must be done without bailing out
> replication for the remaining messages in the folder)
At the moment we replace messages (on the "master knows best" principle).
It would be easy enough to leave the message in place and generate warnings
instead, although this would generate a lot of warnings, one for every bad
message every time that a given mailbox is updated.
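One way to keep the noise down would be to remember which mismatches have already been reported. The sketch below is Python and purely illustrative: guid_mismatches() and the "reported" set are made-up names, and nothing like this exists in the current code. It assumes each copy of a mailbox can be summarised as a map from UID to GUID.

```python
def guid_mismatches(master, replica, reported):
    """Return newly seen (uid, master_guid, replica_guid) mismatches.

    master, replica: dict mapping UID -> GUID for one mailbox.
    reported: set of mismatches already warned about; updated in place.
    """
    fresh = []
    # Only UIDs present on both sides can disagree about their GUID.
    for uid in sorted(master.keys() & replica.keys()):
        if master[uid] != replica[uid]:
            key = (uid, master[uid], replica[uid])
            if key not in reported:
                reported.add(key)
                fresh.append(key)
    return fresh
```

The "reported" set would have to be stored somewhere that survives restarts for this to help in practice, which is extra state the current engine does not keep.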
> f) SHOULD keep replicating in the face of an error which
> affects a single mailbox, keeping track of that mailbox
> so that a sysadmin can fix the issue and then replicate
> that mailbox by hand.
You could try disabling the MAILBOX -> USER promotion to see what happens:
the 3 x MAILBOXES retry will fix most transient problems caused by
mailboxes moving around, leaving just the permanent errors.
The MAILBOX -> USER promotion was originally there on the principle that a
mailbox disappearing under our feet was likely to appear somewhere else in
the same account (without shared mailboxes to worry about).
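With the promotion disabled, the behaviour would look roughly like the Python sketch below. This is a simplified model and not the real sync_client logic: sync_with_retries() and the sync_one callback are hypothetical names, and it assumes the retry applies to whatever is left of the batch after each pass.

```python
def sync_with_retries(mailboxes, sync_one, attempts=3):
    """Try each mailbox up to `attempts` times; return those that still fail."""
    pending = list(mailboxes)
    for _ in range(attempts):
        failed = []
        for name in pending:
            try:
                sync_one(name)       # hypothetical per-mailbox sync call
            except Exception:
                failed.append(name)  # keep it for the next pass
        pending = failed
        if not pending:
            break
    # No MAILBOX -> USER promotion here: leftovers are simply reported.
    return pending
```

Whatever comes back from this is the list of permanent errors that a sysadmin would need to look at.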
My nightmare scenario is a replication engine which carries on running in
the face of mboxlist corruption on the master: you could lose a lot of
mailboxes on the replica that way.
> g) MAY have a method to replicate to two different replicas
> concurrently (replay the same sync_log messages twice)
> allowing one replica to be taken out of service and
> a new one created while having no "gaps" in which there
> is no second copy alive (we use rsync, rsync again,
> stop replication, rsync a third time, start replication
> to the new site - but it's messy and gappy)
It would be easy enough to generate multiple replication log files.
MySQL keeps a single transaction log for multiple replicas, but that file
contains quite a lot of information about each transaction. In contrast
the Cyrus sync log is just a list of objects we need to pay attention to:
so the files carry much less state, particularly once duplicates are removed.
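As an illustration of how little there is to it, here is a Python sketch of fanning one stream of log entries out to several replicas. The fan_out() name, the replica names and the example entry strings are assumptions, and the real thing would append to files rather than build lists in memory.

```python
def fan_out(entries, replicas):
    """Give every replica its own ordered, duplicate-free copy of the log."""
    seen = set()
    unique = []
    for entry in entries:
        # An entry only names an object to look at, so repeats add nothing.
        if entry not in seen:
            seen.add(entry)
            unique.append(entry)
    # Each replica gets an independent list it can consume at its own pace.
    return {name: list(unique) for name in replicas}
```

For example, fan_out(["MAILBOX user.a", "USER b", "MAILBOX user.a"], ["r1", "r2"]) gives both replicas the two distinct entries in their original order.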
David Carter Email: David.Carter at ucs.cam.ac.uk
University Computing Service, Phone: (01223) 334502
New Museums Site, Pembroke Street, Fax: (01223) 334679
Cambridge UK. CB2 3QH.