Funding Cyrus High Availability

Fri Sep 17 03:12:07 EDT 2004

On Fri, 17 Sep 2004, Paul Dekkers wrote:

> Date: Fri, 17 Sep 2004 08:25:26 +0200
> From: Paul Dekkers <Paul.Dekkers at surfnet.nl>
> To: info-cyrus at lists.andrew.cmu.edu
> Cc: ken at oceana.com
> Subject: Re: Funding Cyrus High Availability
> 
> Hi,
>
> Eric S. Pulley wrote:
>
>>> Question:   Are people looking at this as both redundancy and
>>> performance, or just redundancy?
>> 
>> Cyrus performs pretty well already. Background redundancy would be 
>> awesome. Especially if we had control over when the syncing process 
>> occurred either via time interval or date/time.
>
> I would say not at an interval but as soon as there is an action performed on 
> one mailbox, the other one would be pushed to do something. I believe that is 
> called rolling replication.
>
> I would not be really happy with an interval synchronisation. It would make it 
> harder to use both platforms at the same time, and that is what I want as 
> well. So there is a little bit of load-balancing involved, but more and more 
> _availability_.
>
> Being able to use both platforms at the same time maybe implies that there is 
> either no master/slave role or that this is auto-elected between the two and 
> that this role is floating...

Right, but there are already tools freely available on most platforms to
do the election and the changing of roles (by switching between config
files and restarting the master). What is currently lacking is any ability
for Cyrus to act in a master/slave role at all. Once we have that, it's
just a little scripting to tie in just about any failover software and
make it automatic.

One thing we need to watch out for here is that we don't set an
impossible/unreasonable goal. Don't try to solve every problem and add
every availability feature you can imagine all at once. Instead, let's look
at the building blocks that are needed and identify what's currently not
available.

Currently we have Murder, which will spread the load across multiple
machines.

Currently we have many tools available to detect a server failure and run
local scripts to reconfigure machines (HACMP on AIX, heartbeat for Linux,
*BSD, Solaris, etc.).

What we currently do not have is any ability to have one mailstore updated
to match changes in another one.

Once we have that ability, there are many things that can be built by
gluing together existing code. Once we have a bit of experience with
people actually using these features, it will be obvious which features
need better integration with Cyrus and which make sense to remain
separate.

I also would not be really satisfied with interval synchronisation as the 
only choice.

I think we need something where the primary mailstore pushes a record of
its changes to the secondary mailstore.

This can then be tweaked in several directions.

1. locking can be added so that the primary doesn't complete its command 
until the secondary says it has a permanent record of the change 
(two-phase commit or a reasonable facsimile of such)

2. batch up the changes until they hit some threshold (size or time or 
combination) and then send a batch of changes all at once

3. recognise its own changes to gain the ability to push updates in both 
directions at the same time (true two-phase commit with bi-directional 
replication; some horrible pathological performance cases, but attractive 
in some situations)

or other variants.
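
As a rough illustration of option 2 only, here is a minimal Python sketch
of batching by size or age. Nothing here is Cyrus code: ChangeBatcher, the
send callback, and the threshold names are all made up for the example, and
a real version would need durable storage for pending changes.

```python
import time
from typing import Callable, List, Optional


class ChangeBatcher:
    """Collect change records and hand them to `send` in batches.

    Hypothetical helper: a batch goes out when it reaches `max_items`
    records or when its oldest record is `max_age` seconds old.
    """

    def __init__(self, send: Callable[[List[str]], None],
                 max_items: int = 100, max_age: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._max_items = max_items
        self._max_age = max_age
        self._clock = clock
        self._pending: List[str] = []
        self._first_at: Optional[float] = None

    def add(self, change: str) -> None:
        """Queue one change and flush if a threshold has been reached."""
        if not self._pending:
            self._first_at = self._clock()
        self._pending.append(change)
        self.poll()

    def poll(self) -> None:
        """Flush if the batch is full or too old; call this periodically."""
        if not self._pending:
            return
        too_big = len(self._pending) >= self._max_items
        too_old = self._clock() - self._first_at >= self._max_age
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send whatever is pending as one batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._first_at = None
            self._send(batch)
```

Setting max_items to 1 degenerates into pushing every change as it happens,
which is the rolling behaviour Paul asked for; option 1 would additionally
have the primary wait for an acknowledgement before completing the command.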

But these all share the same common need: the ability for the master to
output all its changes, and the ability for a slave to read those changes
and update itself to match.
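
To make that building block concrete, here is a toy Python sketch, under
heavy assumptions: the mailstore is just an in-memory dict, and the record
format (one JSON object per line with op/mailbox/uid fields) is invented
for illustration. It is not how Cyrus stores or logs anything.

```python
import json
from typing import Dict, List, Optional


class MailStore:
    """Toy mailstore: mailbox name -> uid -> {"body": ..., "flags": [...]}.

    If `log` is given (the master case), every change is also written to it
    as one JSON line. A slave is simply a MailStore without a log.
    """

    def __init__(self, log: Optional[List[str]] = None) -> None:
        self.mailboxes: Dict[str, Dict[int, dict]] = {}
        self.log = log

    def _record(self, **change) -> None:
        if self.log is not None:
            self.log.append(json.dumps(change, sort_keys=True))

    def append(self, mailbox: str, uid: int, body: str) -> None:
        self.mailboxes.setdefault(mailbox, {})[uid] = {"body": body, "flags": []}
        self._record(op="append", mailbox=mailbox, uid=uid, body=body)

    def set_flags(self, mailbox: str, uid: int, flags: List[str]) -> None:
        self.mailboxes[mailbox][uid]["flags"] = sorted(flags)
        self._record(op="set_flags", mailbox=mailbox, uid=uid, flags=sorted(flags))

    def expunge(self, mailbox: str, uid: int) -> None:
        del self.mailboxes[mailbox][uid]
        self._record(op="expunge", mailbox=mailbox, uid=uid)


def apply_change(slave: MailStore, line: str) -> None:
    """Replay one change record from the master onto the slave."""
    change = json.loads(line)
    op = change["op"]
    if op == "append":
        slave.append(change["mailbox"], change["uid"], change["body"])
    elif op == "set_flags":
        slave.set_flags(change["mailbox"], change["uid"], change["flags"])
    elif op == "expunge":
        slave.expunge(change["mailbox"], change["uid"])
    else:
        raise ValueError("unknown change record: %r" % op)
```

Whether the lines are pushed one at a time, batched, or acknowledged before
the master completes its command is then a transport decision layered on
top of the same record stream.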

The nice thing is that with IMAP much of the data needed is already
output. You could do a first approximation of this with a client that
opened a separate connection to every folder on the primary server and
just sat watching for server messages; whenever it saw an update, it would
send the matching command to the slave (fetching the full data as needed to
get all the info). This obviously won't scale to any reasonable size, but
it means that most of what's needed is already identified, so the core
could be just a common output of the existing messages with a little more
data (mailbox and folder in most cases, message contents in a few).
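
As a sketch of the "watching client" idea, the Python below turns a few of
the untagged responses an IMAP server already sends (EXISTS, EXPUNGE, and
FETCH with FLAGS) into change records. The to_change function and the
record layout are invented for this example; the mailbox name has to be
supplied by the caller because the responses themselves don't carry it,
which is exactly the "little more data" mentioned above.

```python
import re
from typing import Any, Dict, Optional

# Untagged responses of the form "* <n> EXISTS", "* <n> EXPUNGE",
# or "* <n> FETCH (...)".
_UNTAGGED = re.compile(r"^\* (\d+) (EXISTS|EXPUNGE|FETCH)(?: \((.*)\))?$")
_FLAGS = re.compile(r"FLAGS \(([^)]*)\)")


def to_change(mailbox: str, line: str) -> Optional[Dict[str, Any]]:
    """Map one untagged server response to a change record, or None.

    `mailbox` is whichever folder this connection has selected.
    """
    m = _UNTAGGED.match(line.strip())
    if m is None:
        return None  # not a response this sketch understands
    number, kind, data = int(m.group(1)), m.group(2), m.group(3)
    if kind == "EXISTS":
        # Only a message count: the watcher still has to fetch the new
        # message contents before it can replay them on the slave.
        return {"mailbox": mailbox, "op": "exists", "count": number}
    if kind == "EXPUNGE":
        return {"mailbox": mailbox, "op": "expunge", "seq": number}
    flags = _FLAGS.search(data or "")
    if flags is None:
        return None  # a FETCH response without flag data
    return {"mailbox": mailbox, "op": "set_flags", "seq": number,
            "flags": flags.group(1).split()}
```

Note that these responses use message sequence numbers, which shift after
an expunge, so a real change feed would presumably want UIDs instead.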

Let's get these small but critical pieces done, and then we can grow and
experiment from there.

David Lang

> Paul
>
> ---
> Cyrus Home Page: http://asg.web.cmu.edu/cyrus
> Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
>

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare