Miserable performance of cyrus-imapd 2.3.9 -- seems to be locking issues

Nik Conwell nik at bu.edu
Fri Feb 29 06:39:07 EST 2008


On Feb 28, 2008, at 4:38 PM, Jeff Fookson wrote:

> is about 200GB.  There are typically about 200 'imapd' processes at a
> given time and a hugely varying number of 'lmtpds' (from about 6 to
> many hundreds during times of greatest pathology). System load is
> correspondingly in the 2-15 range, but can spike to 50-70!

Typically when a long-held lock is finally released you get load spikes,
as all the blocked work can now progress.  That implies one thing was
holding the lock for a long time, probably because it was itself being
impeded by something else.  If instead many things were hitting the lock
at a high rate, you wouldn't expect to see spikes; the system might even
look idle, as everything is just waiting for the lock.
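
To make the "one slow holder" case concrete, here is a minimal sketch in Python (not Cyrus code). A child process takes a write lock on a scratch file and stalls, while the parent times how long its own blocking lock request waits. The function name and the 0.3 second hold time are made up for illustration. On Linux, fcntl.lockf() with LOCK_EX is the same blocking fcntl(F_SETLKW) request that shows up in the strace output.

```python
import fcntl
import os
import tempfile
import time


def measure_lock_wait(hold_seconds=0.3):
    """Return how long this process blocks on a write lock that a
    child process holds for roughly hold_seconds."""
    with tempfile.NamedTemporaryFile() as tmp:
        ready_r, ready_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: the slow lock holder. Lock, report, then stall.
            os.close(ready_r)
            child_fd = os.open(tmp.name, os.O_RDWR)
            fcntl.lockf(child_fd, fcntl.LOCK_EX)
            os.write(ready_w, b"x")
            time.sleep(hold_seconds)
            os._exit(0)  # exiting closes the fd and drops the lock

        # Parent: wait until the child really holds the lock.
        os.close(ready_w)
        os.read(ready_r, 1)
        os.close(ready_r)

        fd = os.open(tmp.name, os.O_RDWR)
        start = time.monotonic()
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the child exits
        waited = time.monotonic() - start
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
        os.waitpid(pid, 0)
        return waited


if __name__ == "__main__":
    print("waited %.3f seconds for the lock" % measure_lock_wait())
```

The waiter does no work and uses no CPU while it is blocked, which is why a pile of blocked processes can look like an idle system until the holder lets go.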

> waits of upwards of 1-2 minutes to get a write lock as shown by the
> example below (this is from a trace of an 'lmtpd')
>
> [strace -f -p 9817 -T]
> 9817  fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <84.998159>
[...]
> Can anyone suggest what we might do next to debug the problem further?

Good job with the strace.  Now figure out which file fd 10 is, either
with lsof (e.g. lsof -p 9817) or from earlier in the strace output (look
for "= 10"; that should show the call that opened it).
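
Both lookups can be scripted. The following is a rough Python sketch, assuming a Linux box where /proc/<pid>/fd/<n> is a symlink to the open file. The helper names resolve_fd and find_opens_of_fd are invented for this example, and the regular expression only covers plain open()/openat() lines in strace's default output format.

```python
import os
import re


def resolve_fd(pid, fd, proc_root="/proc"):
    """Return the path that descriptor fd of process pid refers to."""
    return os.readlink(os.path.join(proc_root, str(pid), "fd", str(fd)))


# Matches e.g.:  9817  open("/some/file", O_RDWR) = 10 <0.000020>
_OPEN_RE = re.compile(r'^\s*(?:\d+\s+)?(?:open|openat)\(.*\)\s*=\s*(\d+)')


def find_opens_of_fd(strace_lines, fd):
    """Return the strace lines whose open()/openat() call returned fd."""
    hits = []
    for line in strace_lines:
        match = _OPEN_RE.match(line)
        if match and int(match.group(1)) == fd:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # 9817 and 10 are the pid and fd from the trace quoted above.
    print(resolve_fd(9817, 10))
```

Note that a descriptor number gets reused after close(), so in a long trace the last matching open before the slow fcntl() is the one that counts.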

Then install lslk and figure out who is holding the lock on that file  
and for how long, etc.  Then look at that process to see what it's  
doing for so long (strace again).
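
If lslk is not available, roughly the same information can be read from /proc/locks on Linux. Below is a minimal Python sketch under that assumption. The function names are invented here, and the parser assumes the usual column layout (lock id, an optional "->" marking a blocked waiter, lock class, ADVISORY or MANDATORY, READ or WRITE, pid, then hex-major:hex-minor:inode, start, end). It tells you who holds the lock, but not for how long.

```python
import os


def parse_proc_locks(text):
    """Parse the contents of /proc/locks into a list of dicts."""
    locks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        # Waiters blocked behind a lock are listed with a "->" marker.
        blocked = fields[1] == "->"
        if blocked:
            del fields[1]
        major, minor, inode = fields[5].split(":")
        locks.append({
            "blocked": blocked,
            "kind": fields[1],   # e.g. POSIX or FLOCK
            "mode": fields[3],   # READ or WRITE
            "pid": int(fields[4]),
            "dev": (int(major, 16), int(minor, 16)),
            "inode": int(inode),
        })
    return locks


def lock_holders(locks, inode):
    """Return the pids that currently hold a lock on the given inode."""
    return [l["pid"] for l in locks
            if l["inode"] == inode and not l["blocked"]]


def lock_holders_for_path(path):
    """Look up the holders for a file on the running system (Linux only)."""
    inode = os.stat(path).st_ino
    with open("/proc/locks") as handle:
        return lock_holders(parse_proc_locks(handle.read()), inode)
```

Matching on the inode alone can confuse files on different filesystems; comparing the "dev" pair against os.major() and os.minor() of st_dev would tighten that up. Once you have the holder's pid, strace it as described above.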

-nik


