load balancing at fastmail.fm

urgrue urgrue at bulbous.org
Mon Feb 12 14:52:36 EST 2007


 

David Carter wrote:
> Why do you need NFS?
>
> The whole point of a SAN is distributed access to storage after all :).

A SAN distributes the disk, not the filesystem. I presume in this case
he's not using the SAN for its multiple-client-access features but just
because it's fast and reliable.

>
> Some of my colleagues who run a SAN have had no end of grief. At which 
> point you are dependent on the abilities of the vendor to diagnose and 
> fix problems. It was this experience that encouraged me to try 
> application level replication with lots of small servers in the first 
> place. At least that way I can keep a close eye on what the various 
> copies are up to.

A SAN really has nothing to do with replication. You have your data
somewhere (local or external disks, local or external RAID, NAS, SAN,
etc.), and you've got your various replication options (file-level,
block-level, via client, via server, etc.).
None of these is a replacement for backups.

>
> A SAN doesn't protect you if your filesystem decides to explode: I 
> believe that Fastmail have direct experience of this. Two independent 
> copies of the data allows you to keep running a service for the hours 
> that an fsck typically takes to complete with file per msg stores on 
> large modern disks. It also means rather less stress if the fsck fails 
> to complete.

Fastmail don't use a SAN; as I understand it, they use external RAID
arrays. There are many ways to lose your data, one of these being
filesystem error, others being software bugs and human error.
Block-level replication (typically used in SANs) is very fast and uses
few resources but doesn't protect from filesystem error (although it can
offer instant recovery). File-level replication is somewhat more
resilient and easier to monitor, but is just as prone to human errors,
bugs, misconfigurations, etc.
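
To make the difference concrete, here is a toy sketch in Python. It is
not how any real SAN or replication tool works: the "disk" is just a
list of blocks, the "filesystem" is a hypothetical index of name ->
(block number, checksum), and the checksum test merely stands in for
whatever errors a file-level tool would hit when reading through a
damaged filesystem. All names are made up for illustration.

```python
import zlib

def block_replicate(primary):
    """Block-level: copy every block verbatim, with no knowledge of files."""
    return list(primary)

def file_replicate(primary, index):
    """File-level: read each file via the index and copy only what verifies."""
    replica, failed = {}, []
    for name, (block_no, crc) in index.items():
        data = primary[block_no]
        if zlib.crc32(data) == crc:
            replica[name] = data
        else:
            failed.append(name)  # something to alert on
    return replica, failed

# Hypothetical primary disk holding two one-block "files".
disk = [b"message one", b"message two"]
index = {name: (i, zlib.crc32(disk[i]))
         for i, name in enumerate(["msg1", "msg2"])}

# Simulate filesystem damage to the second block.
disk[1] = b"garbage"

mirror = block_replicate(disk)               # damage is copied as-is
copies, failed = file_replicate(disk, index)  # damage is noticed
```

In this toy model the block-level mirror ends up with the damaged block
too, while the file-level copy skips the bad file and reports it.
Neither approach helps if someone simply deletes the wrong mailbox,
which is where backups come in.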

> I've heard horror stories about all the common Linux filesystems and 
> I've personally watched fsck.ext3 (supposedly the safest option) 
> unravel a filesystem, with thousands of entries left in lost+found. 
> ZFS looks nice.
>

There will be horror stories for every given system in the world. 
Generally speaking ext3 is very reliable, but naturally no filesystem is 
going to remove the need for replication and no replication system is 
going to remove the need for backups.


	



More information about the Info-cyrus mailing list