Implement Cyrus IMAPD in High Load Environment

Simon Matter simon.matter at invoca.ch
Tue Sep 29 08:50:31 EDT 2009


> On Tue, Sep 29, 2009 at 09:45:53AM +0200, Simon Matter wrote:
>> What I'm really wondering is: what filesystem disasters have others
>> seen? How many times was it fsck only, and how many times was it really
>> broken? I'm not talking about laptop and desktop users but about
>> production systems in a production environment with production class
>> hardware and operating systems.
>
> Well - we lost three drives in a 3TB RAID6 partition within 24 hours.
> That was sad.  The third drive wasn't totally lost, just throwing enough
> errors that we remounted the whole thing readonly and kept it around to
> supplement the backup restores.
>
>> Would be really interesting to get some of the good and bad stories even
>> if not directly related to Cyrus-IMAP.
>
> Honestly, the biggest thing is - I've got a unit I've just switched drives
> in. It has 4 x 300GB 15kRPM drives in two RAID1 sets, and 8 x 2TB drives
> in two RAID5 sets.  That's 12TB of data space plus a bit of room for meta.
>
> Those 2TB drives spin at 7k2RPM, that's not that fast.  It takes weeks to
> fill one of those things, and weeks again to copy data off.
>
> Once you start talking multi-day downtimes to restore data, that's when
> your customers take their business elsewhere, and fair enough.  Ok if
> you're a university or business with a captive customer base, but not so
> nice if you're trying to keep customers!

The interesting point is that the discussion started as a ZFS vs.
$ANY_OTHER_FS thing, but it quickly turned out that the filesystem is only
one part of the picture. If your storage fails at the block level, I doubt
the filesystem matters that much.
One of the biggest issues is cheap, big drives being put together into
huge RAID arrays. There is a good chance that if one disk fails, errors
show up on another disk.
What I do with Linux software RAID is split every big disk into smaller
chunks; on a 500G disk, for example, I create 10x50G segments. Then I
create independent RAID devices over the corresponding segment of each
disk. These RAID segments are then put into LVM volume groups. That
prevents a disk from getting kicked out of the RAID completely if only a
small part of the disk is defective. IIRC ZFS does something which in the
end has a similar effect, and AIX SoftRAID does something like that too.
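
To sketch the idea (the device names /dev/sda and /dev/sdb, the segment
count, and the volume group name below are hypothetical examples, and the
per-segment partitions are assumed to exist already), something like this
would generate the mdadm/LVM commands for such a layout:

  #!/usr/bin/env python3
  # Rough sketch only: print the mdadm/LVM commands for building one RAID1
  # per disk segment and pooling the resulting md devices into one LVM
  # volume group. Device names, segment count and VG name are hypothetical.

  DISKS = ["/dev/sda", "/dev/sdb"]  # two identical 500G disks (assumed)
  SEGMENTS = 10                     # 10 x 50G partitions per disk (assumed)

  commands = []

  # One RAID1 array per partition index, mirroring partition N of each
  # disk, so a bad segment only degrades its own md device.
  for i in range(1, SEGMENTS + 1):
      members = " ".join(f"{disk}{i}" for disk in DISKS)
      commands.append(
          f"mdadm --create /dev/md{i} --level=1 "
          f"--raid-devices={len(DISKS)} {members}"
      )

  # Pool all md devices into a single LVM volume group.
  md_devs = " ".join(f"/dev/md{i}" for i in range(1, SEGMENTS + 1))
  commands.append(f"pvcreate {md_devs}")
  commands.append(f"vgcreate vg_mail {md_devs}")

  for cmd in commands:
      print(cmd)

With that layout, a failing area on one disk only degrades the md device
for that segment; the other segments stay fully redundant.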

I'll end here before getting too OT.

Simon


