choosing a file system
Rob Mueller
robm at fastmail.fm
Fri Jan 2 00:19:52 EST 2009
> There are /lots/ of (comparative) tests done: The most recent I could
> find with a quick Google is here:
>
> http://www.phoronix.com/scan.php?page=article&item=ext4_benchmarks
Almost every filesystem benchmark I've ever seen is effectively useless for
comparing what's best for a cyrus mail server. They try to show the
maximums/minimums of a bunch of discrete operation types (e.g. streaming IO,
creating files, deleting files, lots of small random reads, etc.) running on
near-empty volumes.
What none of them show is what happens to a filesystem when it holds a
real-world cyrus mail spool/index:
* 100,000's of directories
* 10,000,000's of files
* 1-1,000,000 files per directory
* files continuously being created and deleted (emails)
* data being appended to existing files (cyrus.* files)
* lots of fsync calls all over the place (every lmtp append has multiple
fsyncs, as well as various imap actions)
* run over the course of multiple years of continuous operations
* with a filesystem that's 60-90% full depending on your usage levels
There are serious fragmentation issues going on here that no benchmark even
comes close to simulating.
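For illustration only, here is a minimal Python sketch of the kind of churn listed above: message files created and deleted, a per-mailbox cyrus.* file appended to, and an fsync after every write. It is a toy approximation and not a real benchmark. The function name churn, the user<N> directory names, the message sizes, the 64-byte index record and the 40% delete ratio are all made-up assumptions, and it would have to run for a very long time on a mostly full volume before it said anything useful about fragmentation.

```python
import os
import random


def churn(root, dirs=20, steps=2000, seed=0):
    """Run a create/delete/append loop with fsyncs; return the live message count."""
    rng = random.Random(seed)
    live = []  # paths of message files that currently exist

    # Hypothetical mailbox layout: one directory per user.
    for d in range(dirs):
        os.makedirs(os.path.join(root, f"user{d}"), exist_ok=True)

    for step in range(steps):
        mailbox = os.path.join(root, f"user{rng.randrange(dirs)}")
        if live and rng.random() < 0.4:
            # Delete a random existing message (assumed 40% delete ratio).
            os.remove(live.pop(rng.randrange(len(live))))
        else:
            # Deliver a new message: write it out and fsync it.
            path = os.path.join(mailbox, f"{step}.")
            with open(path, "wb") as f:
                f.write(os.urandom(rng.randint(1_000, 50_000)))
                f.flush()
                os.fsync(f.fileno())
            live.append(path)

            # Append a small record to the mailbox's cyrus.* file and fsync again.
            with open(os.path.join(mailbox, "cyrus.index"), "ab") as f:
                f.write(b"\0" * 64)
                f.flush()
                os.fsync(f.fileno())
    return len(live)
```

Pointing churn() at a scratch directory on the filesystem under test and timing it as the volume fills up and ages is closer to the workload above than a one-off run on an empty disk, though still far from years of real mail traffic.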
Now from our experience, I can tell you that ext3 really does poorly on this
workload compared to reiserfs. We had two identical servers, one all
reiserfs and one all ext3. The ext3 one started out ok, but over the course
of a few weeks/months, it started getting worse and worse and was eventually
being completely crushed by IO load. The machine running reiserfs had no
problems at all even though it had more users on it as well and was growing
at the same rate as the other machine.
Yes we did have directory indexing enabled (we had it turned on from the
start), and we tried different data modes like data=writeback and
data=ordered but that didn't help either.
To be honest, I don't know why exactly, and working out what's causing IO
bottlenecks is not easy. We just went back to reiserfs.
Some previous comments I've made:
http://www.irbs.net/internet/info-cyrus/0412/0042.html
http://lists.andrew.cmu.edu/pipermail/info-cyrus/2006-October/024119.html
> The problem with reiserfs is... well. The developers have explicitly
> stated that the development of v3 has come to its end, and there was the
In this particular case, I'm really almost happy with this! Reiserfs has
been very stable for us for at least 5 years, and I'm almost glad no-one is
touching it because invariably people working on something will introduce
new weird edge case bugs. The XFS FAQ entry below is from a while back, but it
shows how apparently just adding 'some "sparse" endian annotations' caused a
bug:
http://oss.sgi.com/projects/xfs/faq.html#dir2
That one was really nasty, even the xfs_repair tool couldn't fix it for a
while!
Having said that, there have been some bugs over the last few years with
reiserfs. However, the kernel developers will still help with bug fixes if
you find them and can trace them down:
http://blog.fastmail.fm/2007/09/21/reiserfs-bugs-32-bit-vs-64-bit-kernels-cache-vs-inode-memory/
http://lkml.org/lkml/2005/7/12/396
http://lkml.org/lkml/2008/6/17/9
Rob