More timings populating a mailbox

Ross Boylan ross at biostat.ucsf.edu
Mon Dec 25 02:33:50 EST 2006


Here are more results from trying various filesystems and mount options
while delivering messages to Cyrus 2.2 via UW-IMAP mailutil.  This uses
IMAP, not LMTP, to insert the messages into the mailbox.
The test machine ran a Linux 2.6.18 kernel with a 7200 RPM SATA drive.
This is not server-class hardware, though it's not bad.

For the impatient, best performance came from parallelizing inserts
and using ext3 with noatime and a library that disabled fsync.  One
would not want to disable fsync outside of a migration scenario.
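
As a rough illustration of the parallel-insert idea, here is a minimal
Python sketch of a timing harness.  It is not the procedure used for the
numbers in this message (those came from mailutil).  The deliver()
function is a hypothetical stand-in for one IMAP APPEND, the mailbox
names are made up, and the sketch assumes each worker delivers the whole
message list into its own mailbox, which is only one reading of the
setup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def deliver(mailbox, message):
    """Hypothetical stand-in for one IMAP APPEND; swap in a real client call."""
    time.sleep(0.001)  # simulate per-message round-trip latency
    return 1


def fill_mailbox(mailbox, messages):
    """Insert every message into one mailbox, one after another."""
    return sum(deliver(mailbox, m) for m in messages)


def measure(messages, workers):
    """Fill `workers` distinct mailboxes at once; return (count, messages/sec)."""
    mailboxes = [f"user.test.box{i}" for i in range(workers)]  # made-up names
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda mb: fill_mailbox(mb, messages), mailboxes))
    elapsed = time.perf_counter() - start
    return sum(counts), sum(counts) / elapsed


messages = [f"message {i}" for i in range(50)]
total_1, rate_1 = measure(messages, 1)
total_20, rate_20 = measure(messages, 20)
print(f" 1 worker:   {total_1:5d} messages, {rate_1:8.1f} msg/s")
print(f"20 workers:  {total_20:5d} messages, {rate_20:8.1f} msg/s")
```

With a simulated latency the gain is close to linear; a real server
would add disk and locking contention, so real numbers will differ.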

BASIC DATA

This is the same setup as in my earlier message, but this time with
more varieties.  Performance measure is messages/second, delivering
540 messages (average of 3 trials).  I report results for a single
insert and 20 threads inserting simultaneously (the 20 threads read
from the same mbox file but inserted into different IMAP boxes, all
for the same user).  The single insert is in the column headed "1";
the parallel insert is in the column headed "20".  First the results
for the vanilla cases: 

   1                20
24.7  ext3       185.1
18.3  reiser3.6  171.0  (though also 160 and 142 in other scenarios)
25.4  xfs        223.3
25.4  jfs        114.1
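
The gain from parallelizing can be read off the table by dividing the
"20" column by the "1" column.  A small Python sketch of that
arithmetic, using only the figures above (the function name is just for
illustration):

```python
# Throughput in messages/second, copied from the table above:
# filesystem -> (single insert, 20 parallel inserts)
results = {
    "ext3": (24.7, 185.1),
    "reiser3.6": (18.3, 171.0),
    "xfs": (25.4, 223.3),
    "jfs": (25.4, 114.1),
}


def parallel_gain(single, parallel):
    """Ratio of 20-thread throughput to single-insert throughput."""
    return parallel / single


gains = {fs: parallel_gain(s, p) for fs, (s, p) in results.items()}
for fs, gain in gains.items():
    print(f"{fs:10s} {gain:4.1f}x")
```

The ratios come out between roughly 4.5x (jfs) and 9.3x (reiser3.6),
well short of 20x in every case.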

ext3 was the only filesystem that clearly responded to mount options:

   1                20
25.2  ext3       267.3  noatime

The fakesync library, which disables fsync, produced these results:

   1                20
26.5  ext3       393.2  noatime + fakesync
      ext3       300.0  noatime,data=journal + fakesync
      reiser     232.1  noatime + fakesync

noatime alone didn't help reiser, consistent with the mount man page's
indication that it might not.
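
To get a feel for what a per-file fsync costs on a given disk, something
like the following Python micro-benchmark can be used.  This is only a
sketch: it is not how Cyrus writes messages and not how fakesync works,
and the file count and payload size are arbitrary choices.

```python
import os
import tempfile
import time


def write_files(directory, count, use_fsync, payload=b"x" * 2048):
    """Write `count` small files, optionally fsyncing each; return files/sec."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"msg{i}")
        with open(path, "wb") as f:
            f.write(payload)
            if use_fsync:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the disk
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


with tempfile.TemporaryDirectory() as d:
    with_sync = write_files(d, 100, use_fsync=True)
with tempfile.TemporaryDirectory() as d:
    without_sync = write_files(d, 100, use_fsync=False)

print(f"with fsync:    {with_sync:10.1f} files/s")
print(f"without fsync: {without_sync:10.1f} files/s")
```

The gap between the two rates depends heavily on the filesystem, the
mount options, and the drive's write cache.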

I tried XFS with noatime,osyncisdsync.  I first changed the options with
mount -o remount and got the same results as before.  Suspecting the
remount hadn't caused the options to take effect, I manually umount'd
and then mount'd.  Immediately my tests began failing with I/O errors;
there was a single test mailbox created, and I could not delete it
through cyradm.  Nothing I did produced recovery.  Whether this
indicates an error on my part, danger in using osyncisdsync, or some
combination, I don't know.

Changing ext3 directory indexing had little effect on performance
(the test created few directories, though c. 2,000 files in each of
those directories).

TEST COMMENTARY (i.e., stuff not in the tables)

The tests were only semi-controlled.  They were on a test system that
wasn't doing much else.  Aside from the fact that housekeeping jobs
could interfere and the fact that I did the tests manually, I didn't
always reset things to a clean slate.  For example, even when I
deleted the test mailboxes, other stuff may have been building up
(e.g., in cyrus's internal files, in the filesystem tree).

Several different filesystems seemed to exhibit deteriorating
performance as they filled, though this wasn't totally repeatable.

The advantages of fakesync also seemed to decline with use, or perhaps
with repeated writes in the same short period.

I recall seeing advice to use data journaling to improve performance
(I think the logic was that if all the writes went to one spot on the
disk, the journal, they would go faster).  It didn't help here.
Possibly fakesync delivers any benefit data journaling would produce;
possibly I've misunderstood how to apply it; possibly it doesn't help.

RESULTS COMMENTARY (i.e., what does it mean?)

The relatively slow performance of a single thread suggests that
delays in the TCP dialogues may be a significant factor.  I was not
ambitious enough to implement the Linux socket options that might tune
this, as suggested in earlier threads.  The disk options did not
significantly affect single-thread performance, suggesting the disk is
not the bottleneck in that case.  Note that prior reports on FreeBSD
found that tuning TCP parameters there produced big performance gains.
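
For reference, setting a socket option on a client connection looks
roughly like the Python sketch below.  TCP_NODELAY (disabling Nagle's
algorithm) is an assumption here; the earlier threads may have meant
other options, and applying this to mailutil itself would mean patching
its C source.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: TCP_NODELAY is the option of interest; others may apply.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("TCP_NODELAY enabled:", bool(nodelay))
```

Whether this would help here is untested; it only shows where such a
setting would go.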

Given my ignorance of tuning cyrus, filesystems, and TCP, you should
only take these results as a straw in the wind.

The best/worst performance ratio is pretty large: 393.2/18.3, or roughly
21:1.  Parallelizing was the single biggest winner; while the speed
increase is not linear in the number of threads, it's quite possible
that more than 20 threads would produce more throughput.  Here are some
timings from Reiser (in a slightly different setting than for the
numbers above):
Threads  Messages/sec
   1        18.6
   2        32.3
   3        51.2
   7        75.8
  20       160.5
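
The non-linearity in these Reiser timings can be made explicit by
computing the speedup over one thread and the per-thread efficiency
(speedup divided by thread count).  A short Python sketch using only
the figures above:

```python
# (threads, messages/second) from the Reiser timings above
timings = [(1, 18.6), (2, 32.3), (3, 51.2), (7, 75.8), (20, 160.5)]

base = timings[0][1]  # single-thread throughput
rows = []
for threads, rate in timings:
    speedup = rate / base           # throughput relative to one thread
    efficiency = speedup / threads  # 1.0 would be perfectly linear scaling
    rows.append((threads, speedup, efficiency))
    print(f"{threads:3d} threads  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

Efficiency falls from about 0.87 at 2 threads to about 0.43 at 20
threads, but throughput is still rising at 20.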

Merry Christmas, if you celebrate it!

More info on the fakesync library appears below.

Ross Boylan

On Mon, Sep 25, 2006 at 09:45:57AM -0700, Wil Cooley wrote:
> On Sun, 2006-09-24 at 23:17 -0700, Ross Boylan wrote:
> 
> > First, is this performance to be expected, or might there be something
> > here I can improve?  I have quite a bit of mail I'd like to migrate,
> > so if there's an easy way to speed this up I'd like to do so.
> > 
> > Second, where should I look to diagnose or solve this problem?
> 
> Depending on the number of messages, it could be the constant fsync()
> that slows it down.  Try my fakesync library with LD_PRELOAD and perform
> a test migration; I'd like to know if it makes a difference, because I
> haven't had chance to test it much.  Obviously, you really only want to
> run this during testing or migration, because it would make normal
> operation considerably less reliable.
> 
> http://haus.nakedape.cc/svn/public/trunk/small-projects/fakesync/
> 
> Wil

