Funding Cyrus High Availability
pegasus at nerv.eu.org
Sun Sep 19 18:43:34 EDT 2004
On Sun, 19 Sep 2004 00:52:08 -0700 (PDT)
David Lang <david.lang at digitalinsight.com> wrote:
Nice review of replication ABC :)
Here are my thoughts:
> 1. Active->Slave replication with manual failover
This is really the simplest way to do it. Rsync (and friends) does 90% of
the required job here; the only thing it's lacking is the concept of the
"mailbox" as a unit. It would be nice if our daemon here would do its job in
an atomic way.
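To make the "mailbox as a unit" idea a bit more concrete, here is a minimal
Python sketch. It is only an illustration under loud assumptions: a mailbox is
treated as a plain directory, and sync_mailbox is a made-up helper name, not
anything Cyrus or rsync provides. The new copy is built in a staging directory
next to the destination and then renamed into place, so a reader on the slave
sees either the old mailbox or the new one, never a half-copied mix. There is
still a short window between the two renames, so this is "nearly atomic" at
best.

```python
import os
import shutil
import tempfile


def sync_mailbox(src: str, dest: str) -> None:
    """Copy the mailbox directory `src` to `dest` as a single unit.

    Hypothetical helper: assumes a mailbox is just a directory of files.
    """
    parent = os.path.dirname(os.path.abspath(dest))
    os.makedirs(parent, exist_ok=True)
    # Stage on the same filesystem so the final renames are cheap.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    old = None
    try:
        work = os.path.join(staging, "new")
        shutil.copytree(src, work)  # the slow part happens out of sight
        if os.path.exists(dest):
            old = os.path.join(staging, "old")
            os.rename(dest, old)  # move the previous copy aside
        os.rename(work, dest)  # publish the complete new copy
    except Exception:
        # Put the previous copy back if the new one never made it into place.
        if old is not None and os.path.exists(old) and not os.path.exists(dest):
            os.rename(old, dest)
        raise
    finally:
        shutil.rmtree(staging, ignore_errors=True)
```

A real replication daemon would also have to deal with index files, locking
and partial transfers over the network; the sketch only shows the
"all or nothing per mailbox" part.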
A few days ago someone was asking for an event notification system that
would be able to call some program when a certain action happened on a
mailbox. Something like this would come in handy here, I think :)
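As a rough idea of what such a notification hook could look like, here is a
small Python sketch. Everything in it is hypothetical: the MailboxEvents class,
the event names and the run_program helper are invented for illustration and
do not correspond to any existing Cyrus interface.

```python
import subprocess
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the event name and the mailbox it happened on.
Handler = Callable[[str, str], None]


class MailboxEvents:
    """Hypothetical registry mapping mailbox events to handlers."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Register a handler for one event name (e.g. "append")."""
        self._handlers[event].append(handler)

    def emit(self, event: str, mailbox: str) -> int:
        """Call every handler for `event`; return how many were called."""
        handlers = self._handlers.get(event, [])
        for handler in handlers:
            handler(event, mailbox)
        return len(handlers)


def run_program(argv: List[str]) -> Handler:
    """Build a handler that runs an external program.

    The event name and mailbox are appended as the last two arguments.
    """
    def handler(event: str, mailbox: str) -> None:
        subprocess.run(argv + [event, mailbox], check=True)
    return handler
```

With something like this, a replication script could be registered with
events.on("append", run_program([...])) and would then be started whenever the
server calls events.emit("append", "user.bob").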
> 2. Active->Slave replication with automatic failover
2 is really just 1 + your heartbeat package of choice and some scripts to
tie it all together.
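The "scripts to tie it all together" part boils down to something like the
following Python sketch. It is an assumption-laden toy, not the interface of
any particular heartbeat package: FailoverMonitor and its promote callback are
hypothetical names, and timestamps are passed in explicitly so the logic is
easy to follow.

```python
from typing import Callable, Optional


class FailoverMonitor:
    """Hypothetical watchdog the slave runs to decide when to take over."""

    def __init__(self, timeout: float, promote: Callable[[], None]) -> None:
        self.timeout = timeout          # seconds of silence we tolerate
        self.promote = promote          # called once, when the master is lost
        self.last_seen: Optional[float] = None
        self.promoted = False

    def heartbeat(self, now: float) -> None:
        """Record that the master was alive at time `now`."""
        self.last_seen = now

    def check(self, now: float) -> bool:
        """Promote the slave if the master has been silent for too long.

        Returns True once the slave has been promoted.
        """
        if self.promoted or self.last_seen is None:
            # Never promote before the first heartbeat has been seen.
            return self.promoted
        if now - self.last_seen > self.timeout:
            self.promoted = True
            self.promote()
        return self.promoted
```

In practice promote() would take over the service IP address and start
accepting connections, and you would still have to worry about split-brain
when only the link between the two machines fails.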
> 3. Active->Slave replication with Slave able to accept client connections
I think it would be good here to start thinking about the app itself and
define "connections" better. Cyrus has three kinds of "connections" that
modify a mailbox: lmtp, which puts new mail into a mailbox; pop, which
(generally) retrieves (and deletes) it; and imap, which does both plus some
other things (folder ops and moving mail around).
Now if you decide that it does not hurt you if the slave is "a bit" out of
date when it accepts a connection (but I guess most of us would find this
unacceptable), you can ditch some of the complexity; but you'd want the
changes that were made on the slave in that connection to propagate up to
the master. I don't really like this, because the concepts of master and
slave get blurred here and things can easily end in a mess.
Once you have mailstores that are synchronizing with each other in a way that is
not very well defined, you'll end up with conflicts sooner or later. There
are some unpredictable factors like network latency that can lead you to
unexpected situations easily.
> 4. #3 with automatic failover
Another level of mess over 3 :)
> 5. Active/Active
> designate one of the boxes as primary and identify all items in the
> datastore that absolutely must not be subject to race conditions between
> the two boxes (message UUID for example). In addition to implementing the
> replication needed for #1 modify all functions that need to update these
> critical pieces of data to update them on the master and let the master
> update the other box.
Exactly. This is the atomicity I was mentioning above. I'd say this is going
to be the larger part of the job.
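A toy Python sketch of that idea, in which only the race-sensitive piece (here
a per-mailbox message identifier) is handed out by the designated primary
while either box may accept the delivery. The Primary and Node classes are
hypothetical and live in one process; a real implementation would need a
network protocol, locking and failure handling.

```python
from typing import Dict, List


class Primary:
    """Hypothetical primary: the only place identifiers are handed out."""

    def __init__(self) -> None:
        self._next_uid: Dict[str, int] = {}

    def allocate_uid(self, mailbox: str) -> int:
        """Return the next unused identifier for `mailbox`."""
        uid = self._next_uid.get(mailbox, 1)
        self._next_uid[mailbox] = uid + 1
        return uid


class Node:
    """One active box; it accepts deliveries and replicates them to peers."""

    def __init__(self, name: str, primary: Primary) -> None:
        self.name = name
        self.primary = primary
        self.peers: List["Node"] = []
        self.store: Dict[str, Dict[int, str]] = {}

    def deliver(self, mailbox: str, message: str) -> int:
        """Store a message locally and on every peer under one identifier."""
        uid = self.primary.allocate_uid(mailbox)  # the race-free step
        for node in [self] + self.peers:
            node._apply(mailbox, uid, message)
        return uid

    def _apply(self, mailbox: str, uid: int, message: str) -> None:
        self.store.setdefault(mailbox, {})[uid] = message
```

Because both boxes ask the same primary, two deliveries that arrive on
different boxes can never end up with the same identifier, which is exactly
the race the quoted text wants to rule out.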
> 6. active/active/active/...
This is what most of us would want.
> while #6 is the ideal option to have it can get very complex
Despite everything you've said, I still think this *can* be done in a
relatively simple way. See my previous mail, where I was dreaming about the
whole HA concept in a RAID-like way.
There I assumed murder as the only agent through which clients would be able
to access their mailboxes. If you think of murder handling all of the jobs
of your daemon in 1-4, one thing that you gain immediately is much simpler
synchronization of actions between the mailstore machines. If you start
empty or with exactly the same data on two machines, all that murder needs
to do is make sure that both receive the same commands and data in the same
order.
Also, if you put all the logic in one place, the backend mailstores need not
be taught any special tricks and can remain pretty much as they are today.
Or am I missing something?
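To illustrate what I mean by "the same commands in the same order", here is a
minimal Python sketch. It is not how murder works today; ReplicatingProxy,
Backend and the two toy commands ("append" and "expunge") are made up for the
example. The proxy serializes all commands under one lock and sends each one,
with a sequence number, to every backend.

```python
import threading
from typing import Dict, List, Tuple

Command = Tuple[str, str, str]  # (operation, mailbox, payload)


class Backend:
    """Hypothetical mailstore that simply applies what it is told."""

    def __init__(self) -> None:
        self.mailboxes: Dict[str, List[str]] = {}
        self.log: List[Tuple[int, Command]] = []

    def apply(self, seq: int, command: Command) -> None:
        op, mailbox, payload = command
        if op not in ("append", "expunge"):
            raise ValueError("unknown operation: %r" % (op,))
        messages = self.mailboxes.setdefault(mailbox, [])
        if op == "append":
            messages.append(payload)
        elif payload in messages:
            messages.remove(payload)
        self.log.append((seq, command))


class ReplicatingProxy:
    """Single entry point that mirrors every command to all backends."""

    def __init__(self, backends: List[Backend]) -> None:
        self._backends = list(backends)
        self._lock = threading.Lock()
        self._seq = 0

    def submit(self, command: Command) -> int:
        """Apply `command` to every backend; return its sequence number."""
        with self._lock:  # one command at a time, so the order is global
            self._seq += 1
            for backend in self._backends:
                backend.apply(self._seq, command)
            return self._seq
```

If both backends start empty (or identical), they end up identical after any
number of commands, no matter how many clients talk to the proxy at once. What
the sketch ignores is the hard part: a backend that fails halfway through and
has to be resynchronized later.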
> personally I would like to see #1 (with a sample daemon or two to provide
> basic functionality and leave the doors open for more creative uses)
> followed by #3 while people try and figure out all the problems with #5
> and #6
And I would like us to reach a conclusion here about which kind of HA setup
would be best for all, and focus our energy on only one implementation.
I have enough old hardware here (and I'm getting some more in about a month)
that I can set up a nice little test environment. Right now it also looks
like I'll have plenty of time in February - June 2005, so I can volunteer
to be a tester.
> there are a lot of scenarios that are possible with #1 or #3 that are not
> possible with #5
One, I think, is a slave-of-a-slave-of-a-slave (...) kind of setup. Does
anybody really need such a setup for mail? I understand it for LDAP, for
example, and there are even some cases where it is useful for an SQL
database, but I see no reason to have it for a mail server.