Cyrus crashed on redundant platform - need better availability?
david.lang at digitalinsight.com
Wed Sep 15 11:56:52 EDT 2004
Also take a look at the heartbeat package at linux-ha.org. It works on
Linux, *BSD, and Solaris (there were people working on an AIX port, but
they apparently dropped it shortly before finishing).
On Wed, 15 Sep 2004, Jure Pečar wrote:
> Date: Wed, 15 Sep 2004 17:07:20 +0200
> From: Jure Pečar <pegasus at nerv.eu.org>
> To: Paul Dekkers <Paul.Dekkers at surfnet.nl>
> Cc: Hagedorn at uni-koeln.de, info-cyrus at lists.andrew.cmu.edu
> Subject: Re: Cyrus crashed on redundant platform - need better availability?
> On Wed, 15 Sep 2004 13:38:43 +0200
> Paul Dekkers <Paul.Dekkers at surfnet.nl> wrote:
>> But I suppose RH's cluster manager takes care of mounting the partitions
>> and checking them if there are any errors.
> Not really, at least not by itself. See
> http://people.redhat.com/jrfuller/cms/ for detailed documentation of what is
> included with RH AS 2.1 (it's some $500 extra for AS 3).
> I had to write some pretty paranoid scripts that take care of assembling
> software RAIDs, checking the fs, and mounting it, while keeping an eye on
> the other machine to prevent problems.
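A minimal sketch of the kind of takeover sequence described above, not the scripts Jure actually wrote. It assumes mdadm for assembling the array, the documented fsck exit statuses (0 = clean, 1 = errors corrected), and a caller-supplied peer_is_alive check. The function name, the device /dev/md0, and the mount point are all hypothetical.

```python
import subprocess


def run_command(argv):
    """Run a command and return its exit status."""
    return subprocess.call(argv)


def take_over_storage(peer_is_alive, run=run_command,
                      array="/dev/md0", mountpoint="/var/spool/imap"):
    """Return True only if the filesystem ended up mounted.

    peer_is_alive is a hypothetical callable supplied by the caller;
    array and mountpoint are illustrative defaults, not from the thread.
    """
    # Never touch shared storage while the other node may still be using it.
    if peer_is_alive():
        return False
    # Assemble the software RAID arrays listed in the mdadm configuration.
    if run(["mdadm", "--assemble", "--scan"]) != 0:
        return False
    # fsck: 0 means clean, 1 means errors were corrected; anything else is unsafe.
    if run(["fsck", "-p", array]) not in (0, 1):
        return False
    # Mount only after the check passed.
    return run(["mount", array, mountpoint]) == 0
```

Each step refuses to continue if the previous one failed, which is the "paranoid" part: a half-assembled array or a dirty filesystem is never mounted.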
> Of course all this would be much easier with some kind of clustered fs, but
> clustered fs brings a new problem: locking. Almost all I've seen so far have
> an external 'locking manager' on a separate box, which brings Ethernet
> latency into every lock operation, which I'm sure is very noticeable in
> lock-heavy usage patterns such as mail. But this is just my feeling; I
> haven't yet benchmarked any :)
>> Do you think using RH's cluster software is a valuable consideration for
>> this kind of clustering setup? Using FreeBSD there are not that many
>> clustering solutions for now, and if it's advisable to at least consider
>> using RH here (although I have no experience with RH) we can certainly
>> look at it. (Any idea how fast RH would "recover services"?)
> This RH cluster software is nothing fancy; I'm sure equivalents exist for
> BSDs. See documentation link above. Actually it is just Kimberlite
> (http://oss.missioncriticallinux.com/projects/kimberlite/), sold with RedHat
> "Speed" of recovery is almost completely out of the cluster's control. The
> only thing that matters for the cluster is what your cyrus init script
> returns when called with the 'status' parameter. Everything else is up to
> your init scripts.
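To illustrate the point about the 'status' parameter, here is a rough sketch of how a monitor could interpret an init script's exit status. It assumes the common convention that an exit status of 0 means the service is running. The script path and the function name are hypothetical, and this is not taken from the Red Hat or Kimberlite code.

```python
import subprocess


def service_is_healthy(init_script, run=subprocess.call):
    """Return True if '<init_script> status' exits with status 0.

    Assumes the usual init-script convention that 0 means "running";
    the cluster software may interpret other values differently.
    """
    return run([init_script, "status"]) == 0


if __name__ == "__main__":
    # Hypothetical path; use whatever your distribution installs.
    print(service_is_healthy("/etc/init.d/cyrus-imapd"))
```

If the init script reports success while the server is actually wedged, a monitor like this sees nothing wrong, which is why the quality of the 'status' check matters so much.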
> Of course, if one box dies completely, the other takes over within a
> configurable time.
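The "configurable time" idea can be sketched as a simple timeout on the peer's last sign of life. This is only an illustration under assumed names (PeerMonitor, deadtime, started_at are all hypothetical), not how heartbeat or the Red Hat cluster manager is implemented.

```python
class PeerMonitor:
    """Declare the peer dead once no heartbeat has arrived for `deadtime` seconds."""

    def __init__(self, deadtime, started_at):
        self.deadtime = deadtime      # configurable takeover delay, in seconds
        self.last_beat = started_at   # treat startup as the first sign of life

    def beat(self, now):
        """Record that a heartbeat from the peer arrived at time `now`."""
        self.last_beat = now

    def is_dead(self, now):
        """True once more than `deadtime` seconds have passed since the last beat."""
        return now - self.last_beat > self.deadtime
```

A short deadtime gives faster takeover but raises the risk of a false takeover when the peer is merely slow, so the value is a trade-off rather than something to minimise.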
> Jure Pečar
> Cyrus Home Page: http://asg.web.cmu.edu/cyrus
> Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare