Cyrus crashed on redundant platform - need better availability?

Sebastian Hagedorn Hagedorn at uni-koeln.de
Wed Sep 15 05:18:40 EDT 2004


Hi,

--On Friday, 10 September 2004 16:27 +0200 Paul Dekkers 
<Paul.Dekkers at surfnet.nl> wrote:

>> Right, works fine for us for the most part. Hasn't always been like
>> that, but the most recent kernel updates by Red Hat have improved
>> matters a lot.
>
> What did the kernel improve?

Memory management, for the most part. With 8 GB of RAM and lots of it free 
there were previously situations where either the cache grew too large, 
causing the machine to become extremely slow, or where forks failed (even 
though there were oodles of free RAM). Both seem to have been resolved in 
2.4.9-e.49enterprise.

> You are not using a clustered filesystem,
> right?

No.

>>> Although many on the list claim that this (having 2 boxes with 1
>>> disk-array) is a nice way for redundancy I'm in doubt now if this is
>>> true.
>>
>> It's good but not perfect. We recently installed a huge SAN and are
>> now in the process of moving over the mail data to reside there.
>> Fibrechannel seems to be much more error tolerant than SCSI.
>
> Hmm, I don't expect the problems to be SCSI-related. Maybe it has to do
> with GEOM and SMP in FreeBSD 5.2.1, but not the SCSI-bus itself. (There
> are two separate controllers for both machines, they never see each other
> on the same SCSI bus...)

That's not what I was talking about. We have a similar setup, yet still 
there were instances when Red Hat's cluster software failed to write to the 
shared storage. I guess this was caused by the slow-downs connected to the 
memory management, but Red Hat support indicated that shared storage 
connected via FibreChannel would not have been as susceptible to these 
problems.

--On Friday, 10 September 2004 21:36 +0200 "Jure Pečar" 
<pegasus at nerv.eu.org> wrote:

> The kernel that shipped with RedHat AS 2.1 was useless for most of the
> tasks i tried it with. About three revisions later it became somewhat
> more useful for non-oracle types of use, but i've rolled my own and am
> not following the state of it now.

That's fine if you don't have to rely on commercial support. Our management 
decided to go the supported path all the way. That doesn't leave you many 
options. I have to say that when it works, the cluster software works 
extremely well. It's just that it hasn't always worked in the past ... ;-)

> I haven't had problems with the fiber itself, i've only had lots of fun
> with the firmware on the disks themselves and some with the qlogic
> drivers.

We've had our share of problems with those as well, but I hear that Red Hat 
AS 3.0 ships with QLogic drivers that work out of the box.

Cheers, Sebastian Hagedorn
--
Sebastian Hagedorn M.A. - RZKR-R1 (Gebäude 52), Zimmer 18
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
Universität zu Köln / Cologne University - Tel. +49-221-478-5587