High availability email server...
Pascal Gienger
Pascal.Gienger at uni-konstanz.de
Fri Jul 28 13:40:09 EDT 2006
David Korpiewski <davidk at cs.umass.edu> wrote:
> I spent about 6 months fighting with Apple XSAN and Apple OSX mail to try
> to create a redundant cyrus mail cluster. First of all, don't try it, it
> is a waste of time. Apple states that mail on an XSAN is not supported.
> The reason is that it simply won't run. The Xsan can't handle the large
> amount of small files and will do things like disconnect or corrupting
> the file system.
STOP!
The capability to handle small files efficiently is related to the
filesystem carrying the files and NOT to the physical and logical storage
media (block device) under it.
A SAN is a network in which physical and logical block devices are shared
between nodes, which makes it possible to mount a hard disk or RAID
partition as a block device even if the disk is some miles away. From this
point of view there is no difference between iSCSI (IP) and Fibre Channel
(FCP) besides different hardware.
For your host/mail server there is no difference between having the
filesystem on - say - a 300 GB local hard drive partition or having it on a
SAN volume routed between SAN switches. The OS "sees" it like a normal
block device, and the filesystem just uses that.
So if Apple says that Xsan cannot handle many small files, they are
admitting that the file system Xsan puts on top of the block devices is
crap for many small files.
Most file systems are not built to carry many small files in a directory
and all are prone to performance losses at a certain number of files in a
single directory.
There are techniques to handle these situations - for xfs (as an example)
consider putting *LOTS* of RAM in your machine and always mounting it with
logbufs=8.
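To see where a given filesystem starts to slow down as a single directory
fills up, a rough measurement along these lines can help. This is only a
sketch: the function names, file sizes and counts are my own assumptions,
and `base` should point at a scratch directory on the filesystem you
actually want to test (by default it uses the system temp directory, which
may be a different filesystem).

```python
import os
import tempfile
import time


def time_small_files(directory, count, size=512):
    """Create `count` files of `size` bytes in `directory`; return seconds taken."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        # Hypothetical file naming, one small file per "message".
        path = os.path.join(directory, f"{i}.msg")
        with open(path, "wb") as fh:
            fh.write(payload)
    return time.perf_counter() - start


def per_file_cost(counts=(1000, 5000, 20000), base=None):
    """Return {count: seconds per file}, using a fresh directory for each count.

    `base` is the parent directory for the scratch directories; None means
    the system default temp location.
    """
    results = {}
    for count in counts:
        with tempfile.TemporaryDirectory(dir=base) as tmp:
            results[count] = time_small_files(tmp, count) / count
    return results


if __name__ == "__main__":
    for count, cost in per_file_cost().items():
        print(f"{count:>7} files: {cost * 1e6:8.1f} us per file")
```

If the per-file cost climbs noticeably with the file count, the directory
size is hurting you on that filesystem. Creation time is only one symptom;
lookups and deletes may degrade differently.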
I would NEVER suggest mounting the cyrus mail spool via NFS. Locking is
important, and for such crucial things I like to have a real block device
with a real filesystem, so SANs are OK with me.
We have a 1.5 TB RAID device here which is shared between 2 mail nodes and
2 test nodes. The switch can be done manually (10 seconds downtime) or - if
you wish - via Heartbeat HA software. The one thing you must ensure is that
NEVER, really NEVER, a second node mounts your SAN partition while another
node has it mounted already. Immediate kernel halts and data loss are the
result.
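For a manual switch, a pre-mount check like the following can catch the
obvious mistake. It is a sketch under assumptions: the function names are
mine, the mount tables are expected in /proc/mounts format, and how you
collect them from the other nodes (and whether the device has the same name
on every node) is left to you. The check is also racy, so it does not
replace proper fencing by your cluster software.

```python
def mounted_at(device, mounts_text):
    """Return the mount points where `device` appears in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 2 and fields[0] == device:
            points.append(fields[1])
    return points


def nodes_holding(device, mounts_by_node):
    """Return the sorted names of all nodes whose mount table lists `device`."""
    return sorted(
        node for node, text in mounts_by_node.items() if mounted_at(device, text)
    )


def may_mount(device, this_node, mounts_by_node):
    """True only if no node other than `this_node` has `device` mounted."""
    others = [n for n in nodes_holding(device, mounts_by_node) if n != this_node]
    return not others


if __name__ == "__main__":
    # Hypothetical node names and device paths, for illustration only.
    tables = {
        "mail1": "/dev/sdb1 /var/spool/imap xfs rw 0 0\n",
        "mail2": "/dev/sda1 / ext3 rw 0 0\n",
    }
    print(may_mount("/dev/sdb1", "mail2", tables))
```

Here mail2 is refused because mail1 still holds the partition; it has to be
unmounted on mail1 first.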
Pascal Gienger