Miserable performance of cyrus-imapd 2.3.9 -- seems to be locking issues
Nik Conwell
nik at bu.edu
Fri Feb 29 06:39:07 EST 2008
On Feb 28, 2008, at 4:38 PM, Jeff Fookson wrote:
> is about 200GB. There are typically about 200 'imapd' processes at a
> given time and a hugely varying number of 'lmtpds' (from about 6 to
> many hundreds during times of greatest pathology). System load is
> correspondingly in the 2-15 range, but can spike to 50-70!
Typically, when a long-held lock is finally released you get a load
spike, because all the blocked work can now proceed. That implies one
process was holding the lock for a long time, and was itself probably
being held up by something else. If instead many processes were simply
contending heavily for the lock, you wouldn't expect to see spikes; the
system might even look idle, since everything would just be waiting for
the lock.
> waits of upwards of 1-2 minutes to get a write lock as shown by the
> example below (this is from a trace of an 'lmtpd')
>
> [strace -f -p 9817 -T]
> 9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0,
> len=0}) = 0 <84.998159>
[...]
> Can anyone suggest what we might do next to debug the problem further?
Good job with the strace. Now figure out what fd 10 is, either with
lsof or by looking earlier in the strace output (search for "= 10";
that line should show the call that opened it).
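
If the trace has been saved to a file, the search for "= 10" can be
scripted. What follows is only a rough sketch, not part of the original
advice: the function names are made up for illustration, and it assumes
each strace record sits on a single line in the usual
"call(args) = result <seconds>" layout produced by -T.

```python
import re

# Return value printed after the closing paren, e.g. ") = 10".
RETURN_RE = re.compile(r"\)\s*=\s*(-?\d+)\b")
# Elapsed time appended by `strace -T`, e.g. "<84.998159>" at end of line.
ELAPSED_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def find_fd_origin(lines, fd):
    """Return the trace lines whose call returned `fd` (open, socket, dup...)."""
    hits = []
    for line in lines:
        match = RETURN_RE.search(line)
        if match and int(match.group(1)) == fd:
            hits.append(line.rstrip("\n"))
    return hits


def slow_calls(lines, threshold_seconds):
    """Return (seconds, line) pairs for calls that took longer than the threshold."""
    slow = []
    for line in lines:
        match = ELAPSED_RE.search(line)
        if match and float(match.group(1)) > threshold_seconds:
            slow.append((float(match.group(1)), line.rstrip("\n")))
    return slow
```

For example, with a trace written by "strace -f -T -o trace.txt -p 9817",
find_fd_origin(open("trace.txt"), 10) should list the call that handed
out fd 10, and slow_calls(open("trace.txt"), 1.0) the calls that blocked
for more than a second. If the fd was opened before strace attached,
nothing will match and lsof is the way to go.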
Then install lslk and find out which process is holding the lock on
that file, and for how long, etc. Then look at that process to see what
it's doing for so long (strace again).
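
If lslk isn't handy, Linux also exposes the kernel's lock table in
/proc/locks, which can answer the "who holds it" part. The sketch below
is an alternative to lslk, not something from the original thread. It
assumes the usual /proc/locks layout (id, optional "->" marking a
blocked waiter, lock type, ADVISORY/MANDATORY, READ/WRITE, pid,
major:minor:inode, start, end); the function names are hypothetical.
It matches on inode number only, so check the device too if several
filesystems are involved.

```python
def parse_proc_locks(text):
    """Parse the text of /proc/locks into a list of dicts."""
    locks = []
    for line in text.splitlines():
        fields = line.split()
        # A "->" in second position marks a process blocked waiting for the lock.
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            fields = fields[:1] + fields[2:]
        if len(fields) < 8:
            continue  # skip anything that does not look like a lock record
        _, kind, _, access, pid, dev_inode, start, end = fields[:8]
        inode = int(dev_inode.split(":")[2])  # inode is the decimal third part
        locks.append({
            "blocked": blocked, "kind": kind, "access": access,
            "pid": int(pid), "inode": inode, "start": start, "end": end,
        })
    return locks


def holders_of(locks, inode):
    """PIDs that currently hold a lock on the given inode."""
    return [l["pid"] for l in locks if l["inode"] == inode and not l["blocked"]]


def waiters_on(locks, inode):
    """PIDs that are blocked waiting for a lock on the given inode."""
    return [l["pid"] for l in locks if l["inode"] == inode and l["blocked"]]
```

The inode of the file behind fd 10 can be read with
os.stat("/proc/9817/fd/10").st_ino, and the lock table with
open("/proc/locks").read(). Whatever PID holders_of() returns is the
one to strace next.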
-nik