Cyrus 2.1.9 on Solaris 8 - occasional deadlocks on individual users

Steve Barber steveb at cme.nist.gov
Tue Sep 10 12:46:43 EDT 2002


We're running Cyrus 2.1.9 on Solaris 8 on a fairly high-load production
machine and we're having occasional problems with deadlocks on individual
users.

I'm not certain exactly what we're deadlocking on (is there a good way
to find out for sure?) but have been assuming that it's the user's
.seen file.  Maybe it's an index file?  It seems to affect lmtpd
and imapd at least.  We notice it through lmtpd hanging (see below);
we don't know at this point what (if anything) the user is seeing on
the imap side.  There may be a correlation with dialup users here;
it's possible that this happens when a user's session gets
disconnected uncleanly, but that's speculation right now.

Truss on the users' imapds shows this:

fcntl(7, F_SETLKW, 0xFFBEF3B4)  (sleeping...)

Killing off the user's imapd processes (there's always been more than
one) frees things up again.
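
One way to test the guess about which file is involved would be to
probe a candidate file with a non-blocking lock request, the
counterpart of the blocking F_SETLKW seen in the truss output.  The
following is only a rough sketch in Python, not anything that ships
with Cyrus; the function names are made up for illustration, and the
forked child simply stands in for a stuck imapd holding a lock.  Note
that a successful probe briefly takes the lock itself, so it should be
used with care on a live spool.  (fcntl's F_GETLK would also report
the pid of the lock holder, but its struct layout is platform-specific,
so it is left out here.)

```python
import fcntl
import os
import tempfile


def is_locked_by_other(path):
    """Return True if another process holds a conflicting fcntl lock on path."""
    with open(path, "r+b") as f:
        try:
            # LOCK_NB makes the request fail immediately instead of
            # sleeping the way F_SETLKW does.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # EACCES or EAGAIN: someone else holds the lock.
            return True
        # We got the lock, so nobody else had it; give it back right away.
        fcntl.lockf(f, fcntl.LOCK_UN)
        return False


def demo():
    """Hold a lock in a child process, then probe it from the parent."""
    with tempfile.NamedTemporaryFile() as tmp:
        ready_r, ready_w = os.pipe()
        done_r, done_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: stand-in for a process that never releases its lock.
            fd = os.open(tmp.name, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX)
            os.write(ready_w, b"x")   # tell the parent the lock is held
            os.read(done_r, 1)        # wait until the parent has probed
            os._exit(0)               # exiting releases the lock
        os.read(ready_r, 1)
        held = is_locked_by_other(tmp.name)
        os.write(done_w, b"x")
        os.waitpid(pid, 0)
        released = is_locked_by_other(tmp.name)
        for fd in (ready_r, ready_w, done_r, done_w):
            os.close(fd)
        return held, released


if __name__ == "__main__":
    print(demo())
```

Pointing is_locked_by_other at a suspect file (a seen file or an index
file, say) while lmtpd is hung would at least show whether that file
is the one being held.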

Here's the version information:

name       : Cyrus IMAPD
version    : v2.1.9 2002/08/30 18:40:23
vendor     : Project Cyrus
support-url: http://asg.web.cmu.edu/cyrus
os         : SunOS
os-version : 5.8
environment: Cyrus SASL 2.1.7
             Sleepycat Software: Berkeley DB 4.0.14: (November 18, 2001)
             CMU Sieve 2.2
             TCP Wrappers
             mmap = shared
             lock = fcntl
             nonblock = fcntl
             auth = unix
             idle = idled
             dirhash = full
             mboxlist.db = skiplist
             subs.db = flat
             seen.db = skiplist
             duplicate.db = db3-nosync
             tls.db = db3-nosync

We're also using the master pid-tracking patch that's in bugzilla,
although that doesn't appear to be an issue here.

This is causing big problems for us because sendmail times out
delivering the message, marks localhost as not responding in the
hoststatus database, and then nobody gets email until we find
the processes at fault and kill them.  Turning hoststatus
off isn't a cure although it might minimize the problem.

In the meantime, any suggestions for how to narrow the problem down?

Steve



