Funding Cyrus High Availability
pegasus at nerv.eu.org
Fri Sep 17 06:11:00 EDT 2004
On Fri, 17 Sep 2004 08:25:26 +0200
Paul Dekkers <Paul.Dekkers at surfnet.nl> wrote:
> I would say not at an interval but as soon as there is an action
> performed on one mailbox, the other one would be pushed to do something.
> I believe that is called rolling replication.
> I would not be really happy with an interval synchronisation. It would
> make it harder to use both platforms at the same time, and that is what
> I want as well. So there is a little bit of load-balancing involved, but
> more and more _availability_.
> Being able to use both platforms at the same time maybe implies that
> there is either no master/slave role or that this is auto-elected
> between the two and that this role is floating...
I'm jumping back into this thread a bit late ...
My feeling is that most Cyrus installations run one or a few domains with
many users; at least that is my case. That's why I'd base any kind of
replication we come up with on the mailbox as the base unit. Just as RAID
uses the disk block as its unit, we would use the mailbox (with all its
subfolders). That way, whole domains could still be handled at a higher
level, if needed.
Today we have the option of using Murder (or Perdition, with some added
logic) when more than one backend machine is needed. This gives us a kind
of "raid linear" (Linux md speak), a concatenation of space into a single
mailstore, with all the 'features' of such a setup: if you lose one
machine (disk), all users (data) on that machine (disk) are unavailable.
So what I'm thinking is that we need a kind of RAID1, or mirroring, of
mailboxes. Imagine user1 having its mailbox on server1 and server2, user2 on
server2 and server3, user3 on server3 and server1 ... for example. Murder is
already a central point that knows where a certain mailbox is and how to
proxy POP, IMAP and LMTP to it, and as I see it, it would be best to teach
it how to handle this 'mirroring' too.
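To make the ring layout concrete, here is a minimal sketch in Python. The
function names (mirror_placement, surviving_copy) are made up for
illustration and are not part of Cyrus or Murder; it only shows that with
this layout every mailbox keeps one copy when any single server fails.

```python
def mirror_placement(users, servers):
    """Give each mailbox a (primary, secondary) pair around a ring of servers.

    Hypothetical helper: user i goes on server i and on the next server in
    the ring, as in the user1/user2/user3 example.
    """
    if len(servers) < 2:
        raise ValueError("mirroring needs at least two servers")
    placement = {}
    for i, user in enumerate(users):
        primary = servers[i % len(servers)]
        secondary = servers[(i + 1) % len(servers)]
        placement[user] = (primary, secondary)
    return placement


def surviving_copy(placement, failed_server):
    """Map each user to a server that still holds the mailbox, or None."""
    return {
        user: next((s for s in pair if s != failed_server), None)
        for user, pair in placement.items()
    }


placement = mirror_placement(
    ["user1", "user2", "user3"], ["server1", "server2", "server3"]
)
```

With three servers, losing any one of them still leaves a copy of every
mailbox, which is the difference from the plain concatenation setup.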
Let's say one of the two mailboxes is primary and the other is secondary;
Murder connects to the primary, lets the client do whatever it wants and
then replays the exact same actions to the secondary mailbox. Whether this
is done after the session with the primary has ended or while the client is
still talking to the primary is an implementation detail.
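A rough sketch of the record-then-replay idea, in Python. Everything here is
an assumption for illustration: InMemoryBackend and MirroringProxy are
invented names, and the "append"/"delete" tuples are toy commands, not real
IMAP. A real proxy would have to record actual protocol dialogues.

```python
class InMemoryBackend:
    """Hypothetical stand-in for a backend server: messages per folder."""

    def __init__(self, name):
        self.name = name
        self.folders = {}

    def execute(self, command):
        verb, folder, message = command
        if verb == "append":
            self.folders.setdefault(folder, []).append(message)
        elif verb == "delete":
            self.folders[folder].remove(message)
        else:
            raise ValueError(f"unsupported toy command: {verb}")
        return "OK"


class MirroringProxy:
    """Sends commands to the primary and replays them on the secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.log = []  # commands the primary accepted, not yet replayed

    def handle(self, command):
        reply = self.primary.execute(command)
        # Only reached if the primary accepted the command.
        self.log.append(command)
        return reply

    def replay(self):
        """Apply logged commands to the secondary, in the original order."""
        replayed = 0
        while self.log:
            self.secondary.execute(self.log[0])
            self.log.pop(0)  # drop only after the secondary applied it
            replayed += 1
        return replayed
```

Calling replay() after every handle() gives the "while the client is still
talking" variant; calling it once at the end of the session gives the
other one.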
Performance bonus: connect to both mailboxes at once and treat whichever
responds faster as the primary :)
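One way this race could look, sketched in Python with threads. The probes
are hypothetical zero-argument callables that return once a server has
answered (for example, after reading its greeting); pick_primary is an
invented name, not an existing Murder function.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def pick_primary(probes):
    """Return the name of the first server whose probe finishes cleanly.

    probes: dict mapping a server name to a zero-argument callable
    (an assumption for this sketch) that returns when the server answered.
    """
    with ThreadPoolExecutor(max_workers=len(probes)) as pool:
        futures = {pool.submit(probe): name for name, probe in probes.items()}
        for future in as_completed(futures):
            # Skip servers whose probe failed; take the first healthy one.
            if future.exception() is None:
                # Leaving the "with" block still waits for slower probes.
                return futures[future]
    raise RuntimeError("no replica responded")


# Simulated servers: server2 answers much faster than server1.
winner = pick_primary({
    "server1": lambda: time.sleep(0.3),
    "server2": lambda: time.sleep(0.01),
})
```

Note that this only makes sense when both copies are known to be in sync;
otherwise the faster server may be the stale one.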
Murder would have to know how to record and play back whole client-server
dialogues. Considering that there's already a system in Cyrus that lets the
admin see the 'telemetry' of the IMAP conversation, I guess this could be
extended and tied into Murder.
So far this is just how clients would talk to our system.
What else would we need?
Certainly a mechanism to manually move mailboxes between servers in a way
that Murder knows about the changes. Thinking of it, the mupdate protocol
already knows how to push metadata around; why not extend it so it can also
move mailboxes? Or should a Perl mupdate module be born, with some scripts
then written using it and IMAP?
Then maybe some mechanism for Murder to decide which servers to put newly
created mailboxes on. Ideally this would be plugin based with different
policies (load, disk space, responsiveness, a combination of those,
something else), but a simple round robin would do for a start.
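A small Python sketch of what pluggable placement policies might look like.
The class names and the "free_bytes" statistic are assumptions made up for
this example; Murder has no such interface today.

```python
import itertools


class RoundRobinPolicy:
    """Hand out servers in a fixed rotation, ignoring any statistics."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def choose(self, stats):
        return next(self._cycle)


class MostFreeSpacePolicy:
    """Pick the server reporting the most free disk space."""

    def choose(self, stats):
        # stats: {server_name: {"free_bytes": int}} (hypothetical format)
        return max(stats, key=lambda name: stats[name]["free_bytes"])


def place_new_mailbox(policy, stats):
    """Ask whichever policy is configured where a new mailbox should go."""
    return policy.choose(stats)


stats = {
    "server1": {"free_bytes": 10_000},
    "server2": {"free_bytes": 50_000},
    "server3": {"free_bytes": 20_000},
}
rr = RoundRobinPolicy(stats)
first_four = [place_new_mailbox(rr, stats) for _ in range(4)]
```

Any object with a choose(stats) method can be dropped in, so load-based or
combined policies could be added later without touching the caller.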
For those who do not want to have mailboxes in sync, a mechanism to delay
updates to the secondary mailbox. (In this case, which mailbox is primary
and which is secondary should not change.) Also a way of handling huge piles
of backlog in case one of the machines is down for a longer period of time.
Maybe a mechanism to sync the mailbox from the other server and discard the
backlog would be handy in such a case. And a way to manually trigger such a
resync on a specific mailbox.
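The backlog handling could be sketched like this in Python. SecondaryQueue,
max_backlog and the two callbacks are hypothetical names for illustration:
once the backlog grows past a limit it is thrown away and the mailbox is
marked for a full resync instead of a replay.

```python
from collections import deque


class SecondaryQueue:
    """Pending updates for one mailbox on a lagging or offline secondary."""

    def __init__(self, max_backlog):
        self.max_backlog = max_backlog
        self.backlog = deque()
        self.needs_resync = False

    def record(self, command):
        if self.needs_resync:
            return  # a full resync is pending anyway; nothing to queue
        self.backlog.append(command)
        if len(self.backlog) > self.max_backlog:
            # Too far behind: discard the backlog, copy the mailbox later.
            self.force_resync()

    def force_resync(self):
        """Manual trigger: drop the backlog and request a full copy."""
        self.backlog.clear()
        self.needs_resync = True

    def catch_up(self, apply_command, full_resync):
        """Bring the secondary up to date; both callbacks are assumptions."""
        if self.needs_resync:
            full_resync()
            self.needs_resync = False
            return "resync"
        while self.backlog:
            apply_command(self.backlog[0])
            self.backlog.popleft()  # drop only after it was applied
        return "replay"
```

A delayed-update setup would simply call catch_up() on a timer instead of
after every change. The sketch ignores changes that arrive while a full
resync is running, which a real implementation would have to handle.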
Probably something else I can't think of right now.
So how does this "Cyrus in a RAID view" sound? It should probably be called
"raims", for redundant array of inexpensive mail servers, anyway ;)
This way all the logic is done in one place and you only have to take good
care (in an HA sense) of the mupdate master machine. The others can remain
cheap and relatively dumb, so that they can be pulled offline at will. Given
fast enough and reliable links, this could also work in a geographically
distributed
Ken, is something like this reasonable?
Oh, I'd like to know what the fastmail.fm folks think about this whole HA thing.
I'm sure they have some interesting insights :)