cyrus replication : master to replica and replica to master

Thu Oct 22 20:35:25 EDT 2009

> Client A: upload message to Inbox, gets UID 100
> At the same time, Client B: upload message to Inbox, gets UID 100
>
> You can't have two messages with the same UID.
>
> There are 3 solutions I can see:
>
> 1. MySQL solves this by having interleaved IDs on separate servers (eg.
>   auto-increment column on server A is odd numbers, on server B it's
>   even numbers). I guess you could in theory do the same with IMAP
>   (though I'd have to double check the spec), but it would create
>   really annoying UID lists because you basically lose the ability to
>   use things like 30:50. One other option would be to alternate in
>   100's or something like that (eg. 1-100 on s1, 101-200 on s2, etc)

I realised another potential problem with this.

The IMAP spec says UIDs must be assigned in strictly ascending order. So if
you upload a message to s1 and it gets UID 100, and upload one on s2 and it
gets UID 200, then the next message uploaded on s1 has to get UID 300. In
other words, whenever a UID has been allocated in a "higher range", you have
to jump to the next range of your own. This could cause you to run out of
UIDs quite quickly in pathological back and forth cases (eg 2 IMAP clients
connected separately to s1 & s2 both uploading a bunch of messages to the
same folder).
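
To make the range jumping concrete, here's a rough sketch of how such an
allocator might behave. Everything in it is hypothetical (the block size of
100, the function names, and the assumption that each server instantly knows
the highest UID allocated anywhere); it's not anything cyrus does.

```python
BLOCK = 100  # hypothetical range size, as in "1-100 on s1, 101-200 on s2"


def next_uid(server_index, num_servers, highest_seen, block=BLOCK):
    """Return the smallest UID above highest_seen that lies in a block
    owned by server_index (blocks are handed out round-robin)."""
    candidate = highest_seen + 1
    block_no = (candidate - 1) // block
    # Skip forward until we reach a block this server owns.
    while block_no % num_servers != server_index:
        block_no += 1
        candidate = block_no * block + 1
    return candidate


def simulate(order, num_servers=2, block=BLOCK):
    """Allocate one UID per entry in order (a list of server indexes),
    assuming every server always knows the highest UID used so far."""
    highest = 0
    uids = []
    for server_index in order:
        highest = next_uid(server_index, num_servers, highest, block)
        uids.append(highest)
    return uids


if __name__ == "__main__":
    print(simulate([0, 0, 0]))           # all uploads on s1
    print(simulate([0, 1, 0, 1, 0, 1]))  # strict back and forth between s1 and s2
```

With uploads all on one server the UIDs come out as 1, 2, 3. With strict
alternation they come out as 1, 101, 201, 301, 401, 501, so every upload
burns a whole block of 100 UIDs.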

> 3. Use some conflict resolution strategy. If some client uploads UID 100
>   on s1, and another uploads UID 100 on s2, then when the conflict is
>   noticed, both sides have to delete + expunge the message (because
>   different IMAP clients might have different ideas on what message UID
>   100 is) and create new UIDs 101 and 102 with the two messages. This
>   can be messy because if a POP client is connected, you can't alter
>   the mailbox at all because the message list isn't allowed to change
>   under the POP client's feet, so connected POP clients could cause
>   nasty locking issues.

As an FYI, this is basically what we currently do with active/passive 
replication if we have to do a "hard" failover, but we have to do it all by 
hand.

In a controlled failover, we make sure all the sync logs are correctly 
played before switching roles, so we know master/replica are in sync when we 
change roles.

However if a server hard fails (eg kernel panic or some other OS lockup), 
then we may have to switch roles, without knowing if the logs have been 
cleanly played. When we get the dead server back online, it may be that a 
message with the same UID exists on both ends.

In the current cyrus code, the master wins, and overwrites the message on 
the replica. That's dangerous, and might end up destroying messages. We have 
a patch for cyrus that checks for this case, and compares the GUID of the 
messages (basically SHA1 of message content), and if they're different, it 
refuses to overwrite the message, and instead just logs a message to syslog.
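
As an illustration only (this is not the actual patch), the check amounts to
something like the sketch below. The mailbox is modelled as a plain dict of
UID to message bytes, the GUID is taken to be the SHA1 hex digest of the raw
message, and the function names and the log text are made up.

```python
import hashlib
import logging

log = logging.getLogger("replication-sketch")


def guid(message_bytes):
    """GUID modelled as the SHA1 hex digest of the message content."""
    return hashlib.sha1(message_bytes).hexdigest()


def apply_master_copy(replica, uid, master_bytes):
    """Copy the master's message for uid into replica (a dict of
    uid -> bytes), unless the replica already holds a different message
    under that UID. Returns True if the replica now matches the master."""
    existing = replica.get(uid)
    if existing is not None and guid(existing) != guid(master_bytes):
        # Refuse to overwrite: just log it so a human (or script) can look.
        log.warning("UID %d GUID mismatch: master %s, replica %s",
                    uid, guid(master_bytes), guid(existing))
        return False
    replica[uid] = master_bytes
    return True
```

The point of the design is that a mismatch never destroys data: the worst
case is that the two sides stay out of sync for that one UID until someone
looks at it.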

We have another script that notices those syslog messages, and emails us.

We then have yet another script, that lets us inspect the offending message 
on both sides (ie master and replica), to see what the situation is. In most 
cases, it's a message that was clearly delivered to one side, but not 
replicated to the other before the machine crashed. We can then pass a flag
to the script that makes it delete the UID on both sides, and then re-append
both messages to the current master server, so that both messages get new
UIDs.
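
In toy form, the fix-up step looks roughly like this. It's a sketch under
simplifying assumptions, not our actual script: mailboxes are dicts of UID to
message bytes, resolve_conflict is a made-up name, and the next free UID is
just taken as one more than anything either side has seen.

```python
def resolve_conflict(master, replica, uid):
    """Remove the conflicting uid from both sides and re-append both
    messages to the master under fresh UIDs. master and replica are dicts
    of uid -> bytes. Returns the list of new UIDs."""
    master_msg = master.pop(uid)
    replica_msg = replica.pop(uid)

    # New UIDs must be higher than anything either side has ever used,
    # including the UID we just removed.
    next_uid = max(list(master) + list(replica) + [uid]) + 1

    new_uids = []
    for msg in (master_msg, replica_msg):
        master[next_uid] = msg
        new_uids.append(next_uid)
        next_uid += 1
    return new_uids
```

For example, if both sides hold UID 99 and each has a different message as
UID 100, the two messages come back on the master as UIDs 101 and 102.
Ordinary replication is then assumed to carry them over to the replica.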

Ideally, this whole process would be automated, and at some point we 
probably will make our scripts do it automatically, but it would probably be 
nice if this was eventually pushed down into the cyrus replication protocol 
somehow. As Bron noted, there would still be lots of other things to make 
active-active replication work properly (thinking about folder renames while 
someone else has a folder selected on the other server really scares me), 
but at least this would deal with the most common active-passive case where 
a non-graceful failover occurred.

Rob