High availability email server...

Pascal Gienger Pascal.Gienger at uni-konstanz.de
Fri Jul 28 13:40:09 EDT 2006


David Korpiewski <davidk at cs.umass.edu> wrote:

> I spent about 6 months fighting with Apple XSAN and Apple OSX mail to try
> to create a redundant cyrus mail cluster.  First of all, don't try it, it
> is a waste of time.  Apple states that mail on an XSAN is not supported.
> The reason is that it simply won't run.  The Xsan can't handle the large
> number of small files and will do things like disconnect or corrupt
> the file system.

STOP!
The capability to handle small files efficiently is related to the 
filesystem carrying the files and NOT to the physical and logical storage 
media (block device) under it.

A SAN is a network in which physical and logical block devices are shared 
between nodes. It makes it possible to attach a hard disk or RAID 
partition as a block device even if the disk is some miles away. From this 
point of view there is no difference between iSCSI (IP) and Fibre Channel 
(FCP) apart from the hardware.

For your host/mail server there is no difference between having the 
filesystem on - say - a 300 GB local hard drive partition or having it on a 
SAN volume routed between SAN switches. The OS "sees" it like a normal 
block device, and the filesystem just uses that.
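
To illustrate the point, here is a minimal Python sketch that asks the OS 
whether a path is a block device. It only uses os.stat and stat.S_ISBLK 
from the standard library; the function name is my own. The answer is the 
same kind of answer for a local partition and for a SAN volume, because 
the host does not distinguish between them at this level.

```python
import os
import stat


def is_block_device(path: str) -> bool:
    """Return True if path is a block special file.

    The check is identical for a local partition and a SAN volume:
    both show up as block special files to the host.
    """
    return stat.S_ISBLK(os.stat(path).st_mode)
```

On a Linux host you would call it with a device node, for example 
is_block_device("/dev/sdb1") (a hypothetical device name).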

So if Apple says that Xsan does not handle many files, they are admitting 
that their HFS+ file system is crap for many small files.

Most file systems are not built to hold many small files in one directory, 
and all of them lose performance once a single directory passes a certain 
number of files.

There are techniques to handle these situations. For XFS, as an example, 
consider putting *LOTS* of RAM in your machine and always mounting it with 
logbufs=8.
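
As a rough sketch of what that mount looks like, the Python snippet below 
builds (but does not run) the mount command. The device /dev/sdb1 and the 
mount point /var/spool/imap are assumptions for illustration only, and the 
function name is my own. The range check reflects that XFS accepts between 
2 and 8 in-memory log buffers.

```python
import shlex


def xfs_mount_command(device: str, mount_point: str, logbufs: int = 8) -> list[str]:
    """Build (but do not execute) a mount command for an XFS volume."""
    if not 2 <= logbufs <= 8:
        # XFS accepts between 2 and 8 in-memory log buffers.
        raise ValueError("logbufs must be between 2 and 8")
    return ["mount", "-t", "xfs", "-o", f"logbufs={logbufs}", device, mount_point]


if __name__ == "__main__":
    # Hypothetical device and mount point; substitute your own.
    print(shlex.join(xfs_mount_command("/dev/sdb1", "/var/spool/imap")))
```

Building the command as a list keeps it safe to hand to subprocess.run 
later without going through a shell.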


I would NEVER suggest mounting the Cyrus mail spool via NFS. Locking is 
important, and for such crucial things I like to have a real block device 
with a real filesystem, so SANs are OK with me.

We have a 1.5 TB RAID device here which is shared between 2 mail nodes 
and 2 test nodes. The switchover can be done manually (10 seconds of 
downtime) or - if you wish - via Heartbeat HA software. The one dangerous 
point: you must ensure that NEVER, really NEVER, a second node mounts your 
SAN partition while another node already has it mounted. Immediate kernel 
halts and data loss are the result.
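
To make the "never mount twice" rule concrete, here is a minimal Python 
sketch of a pre-mount check. It assumes you have already collected each 
node's mount table in /proc/mounts format (how you fetch it from the other 
nodes is left out) and that the SAN partition has the same device name on 
every node. Both are assumptions of the sketch, and the function names are 
my own. A check like this is racy, so treat it as a safety net for manual 
switching, not as a substitute for proper HA software.

```python
def mounted_devices(mounts_text: str) -> set[str]:
    """Parse text in /proc/mounts format and return the mounted device names."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # First field is the device, second is the mount point.
            devices.add(fields[0])
    return devices


def safe_to_mount(device: str, mounts_by_node: dict[str, str]) -> bool:
    """Return True only if no node's mount table lists the device.

    mounts_by_node maps a node name to that node's mount table text.
    Assumes the device has the same name on every node.
    """
    return all(
        device not in mounted_devices(text)
        for text in mounts_by_node.values()
    )
```

If any node still lists the device, the answer is False and the second 
node must not mount it.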

Pascal Gienger

