slow mailbox creation

Lawrence Greenfield leg+ at andrew.cmu.edu
Fri Feb 21 16:04:45 EST 2003


   Date: Fri, 21 Feb 2003 12:30:21 -0500 (EST)
   From: David A Powicki <dpowicki at oit.umass.edu>
[...]
   This computer seems happy and fast in all respects, except for
   mailbox creations (4 seconds) and deletions (about 2 seconds) and
   ACL updates (2 seconds).  A truss of an imapd that creates a
   mailbox and then deletes it reveals 75 iterations of this:


Are these the only system calls that are getting iterated?

** If you were seeing:

   fcntl(5, F_SETLKW, 0xFFBEE870) = 0
   fstat(5, 0xFFBEE998) = 0
   stat("/var/cyrus/imap/mailboxes.db", 0xFFBED3D0) = 0 
   fcntl(5, F_SETLKW, 0xFFBED448) = 0
   open(...)
   dup(...)

I'd diagnose it as follows:

This iteration is unusual. What is happening is that the process is
getting an exclusive lock on mailboxes.db and then making sure it has
the latest copy of the file. (It compares the inode of the file
descriptor it has locked with the inode of the "mailboxes.db" file.)
If it is iterating these system calls, it's discovering that some
other process has replaced mailboxes.db.
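
For illustration, here is a minimal sketch of that lock-and-recheck
loop. It is written in Python rather than C and is not the actual
Cyrus code; the function name lock_latest and the throwaway file under
a temp directory are made up for the example. It takes a blocking
exclusive lock (the F_SETLKW step), compares fstat() on the descriptor
with stat() on the path, and reopens if the inodes differ.

```python
import fcntl
import os
import tempfile


def lock_latest(fd, path):
    """Lock fd exclusively and make sure it still refers to path.

    Returns (fd, retries). The returned fd may differ from the one
    passed in if the file was replaced underneath us. Hypothetical
    helper for illustration only.
    """
    retries = 0
    while True:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocking lock, like F_SETLKW
        held = os.fstat(fd)              # the file we actually locked
        current = os.stat(path)          # the file the name points at now
        if (held.st_dev, held.st_ino) == (current.st_dev, current.st_ino):
            return fd, retries
        # Someone renamed a new file into place: drop the stale
        # descriptor, reopen by name, and go around again.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
        fd = os.open(path, os.O_RDWR)
        retries += 1


# Demo: open the file, replace it via rename(), then ask for the lock.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "mailboxes.db")
with open(db_path, "w") as f:
    f.write("old")
stale_fd = os.open(db_path, os.O_RDWR)

with open(db_path + ".NEW", "w") as f:
    f.write("new")
os.rename(db_path + ".NEW", db_path)

locked_fd, retries = lock_latest(stale_fd, db_path)
print("retries needed:", retries)
same_inode = os.fstat(locked_fd).st_ino == os.stat(db_path).st_ino
os.close(locked_fd)
```

One pass through the loop per replacement is normal. Seeing it 75
times in a row would mean the file keeps getting replaced while the
process is trying to lock it.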

** The next question is: what's actually taking the time?

Use "truss -D" instead of just truss. This will get you times on the
fsync()s, which may be taking substantial fractions of a second. We've
noticed a bug with some Solaris setups where fsync() times gradually
climb until they're intolerable. Remounting the filesystem without
logging and then with logging again seems to clear this bogus
behavior.
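
As a rough cross-check outside of truss, fsync() latency on a given
filesystem can also be measured directly. The sketch below is only an
illustration (the name time_fsync and its parameters are made up
here): it appends a small amount of data to a scratch file in the
directory you give it and times each fsync() call.

```python
import os
import tempfile
import time


def time_fsync(directory, rounds=20, size=512):
    """Return a list of per-call fsync() times, in seconds.

    A scratch file is created in `directory` and removed afterwards.
    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    timings = []
    try:
        for _ in range(rounds):
            os.write(fd, b"x" * size)      # dirty the file a little
            start = time.perf_counter()
            os.fsync(fd)                   # force it to stable storage
            timings.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return timings


# Point this at the filesystem under suspicion; the system temp
# directory is used here only so the example runs anywhere.
samples = time_fsync(tempfile.gettempdir(), rounds=5)
print("slowest fsync: %.4f s" % max(samples))
```

Running it against the filesystem that holds mailboxes.db, before and
after a remount, should show whether the fsync() times are the part
that is climbing.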

In fact, on our frontend systems (which are the ones that are
susceptible to this) we have a cronjob that runs the following once a
day:

---
#!/bin/sh

/usr/sbin/mount -o remount,noatime /
/usr/sbin/mount -o remount,noatime,logging /
---

Our backend systems use vxfs and don't seem to suffer from this
problem.

fsyncs should take fractions of a second, not multiple seconds.

Another possibility (we haven't tested this in production) is to
change "use_osync" from 0 to 1 in cyrusdb_skiplist.c. Some benchmarks
I've done show this to yield better performance on Solaris (but not
Linux).
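
To get a feel for the difference on a particular machine, the two
write strategies can be compared with a small test. This sketch
assumes that "use_osync" amounts to opening the file with O_SYNC so
that each write() is synchronous, instead of calling fsync() after
each write; that reading of the flag is an assumption, and the
function name bench_writes is made up for the example. It is not the
benchmark referred to above.

```python
import os
import tempfile
import time


def bench_writes(directory, use_osync, rounds=50, size=256):
    """Time `rounds` small durable writes; returns elapsed seconds.

    use_osync=True  -> open with O_SYNC, no explicit fsync()
    use_osync=False -> plain open, fsync() after every write
    Hypothetical helper for illustration only.
    """
    flags = os.O_RDWR | os.O_CREAT
    if use_osync:
        flags |= os.O_SYNC
    path = os.path.join(directory, "osync-bench.%d" % os.getpid())
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.write(fd, b"x" * size)
            if not use_osync:
                os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


scratch = tempfile.gettempdir()
with_fsync = bench_writes(scratch, use_osync=False, rounds=10)
with_osync = bench_writes(scratch, use_osync=True, rounds=10)
print("fsync per write: %.4f s" % with_fsync)
print("O_SYNC writes:   %.4f s" % with_osync)
```

The numbers only mean something when the scratch directory is on the
same filesystem, with the same mount options, as the database.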

Larry




