Thanks, everybody. That was an interesting thread. Nobody seems to use a NetApp appliance, perhaps due to NFS architecture problems.<br><br>I think I'll look at ext4, which seems to be available in the latest kernel, and also at Solaris, but we don't have enough people to support another OS.<br>
<br>Dom<br><br>And Happy New Year !<br><br><div class="gmail_quote">2008/12/31 Bron Gondwana <span dir="ltr"><<a href="mailto:brong@fastmail.fm">brong@fastmail.fm</a>></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">On Tue, Dec 30, 2008 at 02:43:14PM -0700, Shawn Nock wrote:<br>
> Bron and the fastmail guys could tell you more about reiserfs... we've<br>
> used RH&SuSE/reiserfs/EMC for quite a while and we are very happy.<br>
<br>
</div>Yeah, sure could :)<br>
<br>
You can probably find plenty of stuff from me in the archives about our<br>
setup - the basic things are:<br>
<br>
* separate metadata on RAID1 10kRPM (or 15kRPM in the new boxes) drives.<br>
* data files on RAID5 big slow drives - data IO isn't a limiting factor<br>
* 300GB "slots" with 15GB associated meta drives, like this:<br>
<br>
/dev/sdb6 14016208 8080360 5935848 58% /mnt/meta6<br>
/dev/sdb7 14016208 8064848 5951360 58% /mnt/meta7<br>
/dev/sdb8 14016208 8498812 5517396 61% /mnt/meta8<br>
/dev/sdd2 292959500 248086796 44872704 85% /mnt/data6<br>
/dev/sdd3 292959500 242722420 50237080 83% /mnt/data7<br>
/dev/sdd4 292959500 248840432 44119068 85% /mnt/data8<br>
<br>
as you can see, that balances out pretty nicely. We also store<br>
per-user bayes databases on the associated meta drives.<br>
<br>
We balance our disk usage by moving users between stores when usage<br>
reaches 88% on any partition. We get emailed if it goes above 92%<br>
and paged if it goes above 95%.<br>
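That escalation policy can be sketched in a few lines. This is a minimal illustration only; the threshold constants come from the figures above, but the `actions_for` helper is an invented name, not FastMail's actual tooling:

```python
# Hypothetical sketch of the disk-usage escalation policy described above:
# rebalance users at 88%, email admins at 92%, page someone at 95%.
MOVE, EMAIL, PAGE = 0.88, 0.92, 0.95

def actions_for(usage_fraction):
    """Return the escalation actions for one partition's usage (0.0-1.0)."""
    acts = []
    if usage_fraction > MOVE:
        acts.append("rebalance")   # start moving users off this store
    if usage_fraction > EMAIL:
        acts.append("email")       # notify the admins
    if usage_fraction > PAGE:
        acts.append("page")        # wake someone up
    return acts
```

So a partition at 93% would trigger both a rebalance and an email, but no page.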
<br>
Replication. We have multiple "slots" on each server, and since<br>
they are all the same size, we have replication pairs spread pretty<br>
randomly around the hosts, so the failure of any one drive unit<br>
(SCSI attached SATA) or imap server doesn't significantly overload<br>
any one other machine. By using Cyrus replication rather than,<br>
say, DRBD, a filesystem corruption should only affect a single<br>
partition, which won't take so long to fsck.<br>
<br>
Moving users is easy - we run a sync_server on the Cyrus master, and<br>
just create a custom config directory with symlinks into the tree on<br>
the real server and a rewritten piece of mailboxes.db so we can<br>
rename them during the move if needed. It's all automatic.<br>
<br>
We also have a "CheckReplication" perl module that can be used to<br>
compare two ends to make sure everything is the same. It does full<br>
per-message flags checks, random sha1 integrity checks, etc.<br>
It does require a custom patch to expose the GUID (as DIGEST.SHA1)<br>
via IMAP.<br>
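The core comparison that check performs can be sketched as follows. This is illustrative only: the real CheckReplication module is Perl and talks IMAP, while `compare_ends` and its data shapes are invented here:

```python
# Hypothetical sketch of the per-message comparison: given maps of
# {uid: (flags, sha1)} fetched from each end, report every mismatch.
def compare_ends(master, replica):
    """master/replica: {uid: (frozenset_of_flags, sha1_hex)} -> problems."""
    problems = []
    for uid in sorted(set(master) | set(replica)):
        if uid not in replica:
            problems.append((uid, "missing on replica"))
        elif uid not in master:
            problems.append((uid, "missing on master"))
        elif master[uid][1] != replica[uid][1]:
            problems.append((uid, "sha1 mismatch"))    # integrity check
        elif master[uid][0] != replica[uid][0]:
            problems.append((uid, "flags differ"))     # per-message flags
    return problems
```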
<br>
I lost an entire drive unit on the 26th. It stopped responding.<br>
8 x 1TB drives in it.<br>
<br>
I tried rebooting everything, then switched the affected stores over<br>
to their replicas. Total downtime for those users was about 15<br>
minutes, because I tried the reboot first just in case. (There's a<br>
chance that some messages were delivered and not yet replicated,<br>
so it's better not to bring up the replica uncleanly until you're<br>
sure there's no other choice.)<br>
<br>
In the end I decided that it wasn't recoverable quickly enough to<br>
be viable, so chose new replica pairs for the slots that had been<br>
on that drive unit (we keep some empty space on our machines for<br>
just this eventuality) and started up another handy little script<br>
"sync_all_users" which runs sync_client -u for every user, then<br>
starts the rolling sync_client again at the end. It took about<br>
16 hours to bring everything back to fully replicated again.<br>
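The shape of that "sync_all_users" script is simple. `sync_client -u` and `sync_client -r` (rolling mode) are real Cyrus commands, but this wrapper is a sketch of my own, not the actual FastMail script:

```python
# Hypothetical sketch of "sync_all_users": one full per-user sync for
# every user, then restart rolling replication at the end.
import subprocess

def sync_commands(users):
    """Build the command list: one sync_client -u per user, then -r."""
    cmds = [["sync_client", "-u", u] for u in users]
    cmds.append(["sync_client", "-r"])  # resume rolling replication
    return cmds

def sync_all_users(users, run=subprocess.run):
    for cmd in sync_commands(users):
        run(cmd, check=True)
```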
<font color="#888888"><br>
Bron.<br>
</font></blockquote></div><br><br clear="all"><br>-- <br>Dominique LALOT<br>Systems and Network Engineer<br><a href="http://annuaire.univmed.fr/showuser?uid=lalot">http://annuaire.univmed.fr/showuser?uid=lalot</a><br>