Miserable performance of cyrus-imapd 2.3.9 -- seems to be locking issues

Paul M Fleming pfleming at siumed.edu
Thu Feb 28 17:09:58 EST 2008


Limit the number of lmtpd daemons to around 10 -- that solved the issue 
for me. We let sendmail handle the queuing. It is more than likely a 
locking issue.
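
For reference, the cap can be set with the maxchild argument on the lmtpd
entry in the SERVICES section of cyrus.conf. This is a minimal sketch, not
a copy of my config: the service name (lmtpunix) and the socket path are
assumptions and should match whatever your existing entry already uses.

```
SERVICES {
  # maxchild caps how many lmtpd processes master will spawn at once
  # (service name and socket path below are assumed, not from this thread)
  lmtpunix  cmd="lmtpd" listen="/var/imap/socket/lmtp" prefork=0 maxchild=10
}
```

With the cap in place, deliveries beyond the limit wait in the MTA queue
instead of piling up as extra lmtpd processes.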

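To see how widespread the long lock waits are, the strace output quoted
below can be summarized with a short script. This is only a sketch: it
assumes the trace was saved one syscall per line (for example with
strace's -o option alongside -f and -T), and the function name and the
1-second threshold are made up for illustration.

```python
import re

# Matches blocking-lock calls in `strace -f -T` output, for example:
# 9817  fcntl(10, F_SETLKW, {type=F_WRLCK, ...}) = 0 <84.998159>
# Assumes each line starts with the pid, as it does when -f is used.
LOCK_RE = re.compile(
    r"^(?P<pid>\d+)\s+fcntl\((?P<fd>\d+), F_SETLKW, "
    r"\{.*?type=(?P<type>F_\w+).*?\}\)\s+=\s+(?P<ret>-?\d+)"
    r".*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_lock_waits(lines, threshold=1.0):
    """Return (pid, fd, lock_type, seconds) for F_SETLKW calls that took
    at least `threshold` seconds, slowest first."""
    hits = []
    for line in lines:
        m = LOCK_RE.match(line.strip())
        if m is None:
            continue  # not a blocking fcntl lock call
        secs = float(m.group("secs"))
        if secs >= threshold:
            hits.append(
                (int(m.group("pid")), int(m.group("fd")), m.group("type"), secs)
            )
    return sorted(hits, key=lambda h: h[3], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python slow_locks.py trace.txt
    with open(sys.argv[1]) as f:
        for pid, fd, lock_type, secs in slow_lock_waits(f):
            print(f"pid {pid} fd {fd} {lock_type} waited {secs:.3f}s")
```

The fd number can then be mapped to a file by looking at /proc/<pid>/fd
while the process is still running, which shows which file everyone is
queuing on.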

Michael Bacon wrote:
> What database format are you using for the mailboxes database?  What kind 
> of storage is the "metapartition" (usually /var/imap) on?  What kind of 
> storage are your mail partitions on?
> 
> 
> --On Thursday, February 28, 2008 2:38 PM -0700 Jeff Fookson 
> <jfookson at as.arizona.edu> wrote:
> 
>> Folks-
>>
>> I am hoping to get some help and guidance as to why our installation of
>> cyrus-imapd 2.3.9
>> is unusably slow. Here are the specifics:
>>
>> The software is running on a 1.6GHz Opteron with 2GB memory supporting a
>> user base of about 400
>> users. The average rate of arriving mail is on the order of 1-2
>> messages/sec. The active mailstore
>> is about 200GB.  There are typically about 200  'imapd'
>> processes at a given time and a hugely varying number of 'lmtpds' (from
>> about 6 to many hundreds during
>> times of greatest pathology). System load is correspondingly in the 2-15
>> range, but can spike to 50-70!
>>
>> Our users complain that the system is extremely sluggish during the day
>> when the system is most busy.
>>
>> The most obvious thing we observe is that both the lmtpds and the imapds
>> are spending HUGE times waiting
>> on locks. Even when the system load is only 1-2, an 'strace' attached to
>> an instance of lmtpd or imapd shows
>> waits of  upwards of 1-2 minutes to get a write lock as shown by the
>> example below (this is from a trace of an 'lmtpd')
>>
>> [strace -f -p 9817 -T]
>> 9817  fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <84.998159>
>>
>> We strongly suspect that these long waits on locks are what is
>> causing the slowness our users are reporting.
>>
>> We are under the impression that a single instance of cyrus-imapd scales
>> well up to about 1000 users (with about 1MB active
>> memory per 'imapd' process),  and so we are baffled as to what might be
>> going on.
>>
>> A non-standard aspect of our installation which may have something to do
>> with the problem is that we are
>> running cyrus on an lvm2 partition that itself is running on top of
>> drbd. Thinking that the remote writes
>> to the drbd secondary might be causing delays, we put the primary in
>> stand-alone mode so that the drbd layer
>> was not doing any network activity (the drbd link is running at gigabit
>> speed on its own crossover cable to
>> the secondary box) and saw no significant change in behavior. Any issues
>> due to locking and the lvm2 layer
>> would, of course, still be present even with drbd's activity reduced to
>> just local writes.
>>
>> Can anyone suggest what we might do next to debug the problem further?
>> Needless to say, our users get
>> extremely unhappy when trivial operations in their mail clients take
>> over a minute to complete.
>>
>> Thank you for any thoughts or advice.
>>
>> Jeff Fookson
>>
>> --
>> Jeffrey E. Fookson, PhD			Phone: (520) 621 3091
>> Support Systems Analyst, Principal	jfookson at as.arizona.edu
>> Steward Observatory
>> University of Arizona
>>
>> ----
>> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
>> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
>> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html

