RAID type suggestion
Simon Matter
simon.matter at invoca.ch
Thu Jul 12 05:28:42 EDT 2007
> Hello,
>
> We have a low-end platform of using Core2duo (Xeon) with 4G ram.
> The budget is not much enough to afford SAN.
> We consider SAS finally, but are still having concerns about mail access
> speed.
>
> This RAID could be:
> 1. RAID 1 (OS) + RAID 1 (mail spool)
> 2. RAID 1 (OS) + RAID 5 (mail spool)
> 3. RAID 50, for OS and mail
> 4. RAID 10, for OS and mail
>
> Option 1 is found difficult to expand its mail spool
> RAID 5 is not good for speed, but it can expand
> RAID 50 speed is much slower than RAID 10, but RAID 10 is very expensive
>
>
> Anyone here can provide suggestions?
These are not easy questions, but I'll try to give you some ideas:
I suggest separating [OS + cyrus metadata] on RAID1 and cyrus mailspool on
RAID5.
As the OP already noted, RAID5 _can_ be dangerous. The most common problem
is latent ("sleeping") bad sectors. Once such a sector on one disk is
touched, the whole disk may be marked defunct and your array is degraded.
Then, after you replace the disk, the resync may detect bad sectors on
another disk, and now the array has lost one disk more than it can
tolerate. There are some solutions to this problem:
1) Reading all RAID member disks in the background guards against sleeping
bad blocks, since they are detected soon after they go bad.
2) Instead of creating one large RAID5 from your disks, create a number of
RAID5 slices, so that a bad sector only gets the affected slice removed
(such tricks are usually only possible with software RAID or are built-in
features of a SAN; normal hardware RAID controllers don't have such
features).
3) You could use RAID6, so the array no longer depends on a single disk's
worth of redundancy.
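
To illustrate point 1, here is a minimal sketch of a background read pass. It
is not how any particular RAID implementation does it; the function name
scan_for_unreadable and the 1 MiB chunk size are made up for this example.
It simply reads a device or file front to back and notes which regions
return an I/O error, so latent bad sectors show up before a rebuild needs
them. Linux software RAID has its own built-in check for this, which is
normally the better choice.

```python
import os


def scan_for_unreadable(path, chunk_size=1024 * 1024):
    """Read `path` front to back and report regions that fail to read.

    Returns (bytes_covered, bad), where `bad` is a list of
    (offset, length) tuples for chunks that raised an I/O error.
    Hypothetical helper for illustration only. Note that reads may be
    served from the page cache, so a real scrub tool has to make sure
    the request actually reaches the disk.
    """
    bad = []
    bytes_covered = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file and, on
        # Linux, of a block device as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                # Remember the failing region and keep going.
                bad.append((offset, length))
            offset += length
            bytes_covered += length
    finally:
        os.close(fd)
    return bytes_covered, bad
```

Run something like this from cron at a quiet time against each member disk
(the device path is yours to fill in) and alert when the returned list is
not empty.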
About speed: Most people suggest that more spindles mean more speed. But you
should always consider what exactly you need. For a cyrus mailspool, you
don't need very high sequential throughput, but you do need good random
access times, and the random accesses should be independent of each other.
In a normal RAID5, say with 8 disks, for every single small chunk of data,
[all disks -1] have to access the location in question to get you the
data. RAID50 is already much better because you get different groups of
independent disks, which means there is a good chance that different
random accesses can be executed on those groups in parallel.
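
A rough back-of-the-envelope sketch of that "good chance", under simplifying
assumptions that are mine and not measured: each small request keeps one
whole RAID5 group busy (as described above), and requests land on the groups
uniformly and independently. The function names are invented for this
example.

```python
def chance_in_different_groups(groups):
    """Probability that two independent, uniformly placed requests
    land in different RAID5 groups (assumed model, see above)."""
    return 1.0 - 1.0 / groups


def expected_busy_groups(groups, requests):
    """Expected number of distinct groups hit by `requests` concurrent
    requests, i.e. how many can be served in parallel in this model."""
    return groups * (1.0 - (1.0 - 1.0 / groups) ** requests)


# 8 disks as one RAID5 versus the same disks as RAID50 (2 groups of 4).
for label, groups in [("RAID5, 1 group of 8", 1),
                      ("RAID50, 2 groups of 4", 2)]:
    print(label,
          chance_in_different_groups(groups),
          expected_busy_groups(groups, requests=4))
```

In this model a single 8-disk RAID5 never serves more than one request at a
time, while two groups of 4 give a 50% chance that any two requests run in
parallel. Real controllers and chunk sizes will shift the numbers, so treat
it only as a way to think about the trade-off.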
Simon