Miserable performance of cyrus-imapd 2.3.9 -- seems to be locking issues
Kenneth Marshall
ktm at rice.edu
Thu Feb 28 17:56:18 EST 2008
It may be that the software RAID 5 is your problem. Without
NVRAM for a write cache, every write has to touch all 3 disks.
That will cause quite a bottleneck.
Ken
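
To put rough numbers on the RAID 5 point above, here is a minimal sketch
based on the textbook RAID 5 write strategies, not on measurements of md.
A write smaller than a full stripe is done either as read-modify-write
(read old data and old parity, write new data and new parity) or as
reconstruct-write (read the other data chunks, write new data and new
parity). The function name small_write_ios is made up for illustration.

```python
def small_write_ios(n_disks: int) -> dict:
    """Disk I/Os for one sub-stripe write on a RAID 5 array of n_disks members."""
    if n_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    # Read-modify-write: read old data + old parity, write new data + new parity.
    rmw = {"reads": 2, "writes": 2, "disks_touched": 2}
    # Reconstruct-write: read the other data chunks in the stripe, then
    # write the new data chunk and the recomputed parity chunk.
    other_data_chunks = n_disks - 2
    rcw = {"reads": other_data_chunks, "writes": 2, "disks_touched": n_disks}
    return {"read_modify_write": rmw, "reconstruct_write": rcw}


if __name__ == "__main__":
    for name, cost in small_write_ios(3).items():
        total = cost["reads"] + cost["writes"]
        print(f"{name}: {total} I/Os across {cost['disks_touched']} disks")
```

Either way, one small logical write turns into three or four physical
I/Os on a 3-disk array, which is why many small synchronous writes hurt.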
On Thu, Feb 28, 2008 at 03:36:43PM -0700, Jeff Fookson wrote:
> Michael Bacon wrote:
>
> > What database format are you using for the mailboxes database? What
> > kind of storage is the "metapartition" (usually /var/imap) on? What
> > kind of storage are your mail partitions on?
>
> Databases are all skiplist. Our mail partition and the metapartition are
> both on the same filesystem, as we intended that both be part of the
> same drbd mirror. That partition is a Linux software RAID 5 (3 SATA
> disks). On top of the md layer is the drbd device; on top of that is an
> lvm2 logical volume; on top of that is an ext3 filesystem, mounted as
> '/var/imap'. The mail is then in /var/imap/mail and the metadata in
> /var/imap/config (and we also have /var/imap/certs for the SSL stuff,
> and /var/imap/sieve for sieve scripts).
>
> Thanks.
>
> Jeff Fookson
>
> >
> >
> > --On Thursday, February 28, 2008 2:38 PM -0700 Jeff Fookson
> > <jfookson at as.arizona.edu> wrote:
> >
> >> Folks-
> >>
> >> I am hoping to get some help and guidance as to why our installation of
> >> cyrus-imapd 2.3.9
> >> is unusably slow. Here are the specifics:
> >>
> >> The software is running on a 1.6GHz Opteron with 2GB of memory supporting a
> >> user base of about 400
> >> users. The average rate of arriving mail is on the order of 1-2
> >> messages/sec. The active mailstore
> >> is about 200GB. There are typically about 200 'imapd'
> >> processes at a given time and a hugely varying number of 'lmtpds' (from
> >> about 6 to many hundreds during
> >> times of greatest pathology). System load is correspondingly in the 2-15
> >> range, but can spike to 50-70!
> >>
> >> Our users complain that the system is extremely sluggish during the day
> >> when the system is most busy.
> >>
> >> The most obvious thing we observe is that both the lmtpds and the imapds
> >> are spending HUGE amounts of time waiting
> >> on locks. Even when the system load is only 1-2, an 'strace' attached to
> >> an instance of lmtpd or imapd shows
> >> waits of upwards of 1-2 minutes to get a write lock, as shown by the
> >> example below (this is from a trace of an 'lmtpd'):
> >>
> >> [strace -f -p 9817 -T]
> >> 9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <84.998159>
> >>
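
To see how widespread waits like the one in the trace above are, the
'strace -T' output can be filtered for slow F_SETLKW calls. A minimal
sketch follows, assuming one syscall per line in the format quoted above
(pid, call, return value, then the elapsed time in angle brackets). The
helper name slow_lock_waits and the 1-second threshold are arbitrary
choices for illustration.

```python
import re

# One `strace -f -T` line: "<pid> fcntl(<fd>, F_SETLKW, {...}) = <ret> <seconds>"
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+fcntl\((?P<fd>\d+),\s*F_SETLKW,.*\)"
    r"\s*=\s*(?P<ret>-?\d+).*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_lock_waits(lines, threshold=1.0):
    """Return (pid, fd, seconds) for each F_SETLKW call lasting >= threshold seconds."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and float(m.group("secs")) >= threshold:
            hits.append((int(m.group("pid")), int(m.group("fd")), float(m.group("secs"))))
    return hits


if __name__ == "__main__":
    # The trace line quoted above, joined back onto a single line.
    sample = [
        "9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, "
        "start=0, len=0}) = 0 <84.998159>",
    ]
    for pid, fd, secs in slow_lock_waits(sample):
        print(f"pid {pid} waited {secs:.1f}s for a lock on fd {fd}")
```

Looking up the reported fd under /proc/<pid>/fd then shows which file
the process was waiting to lock.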
> >> We strongly suspect that these long waits on locks are what is
> >> causing the slowness our users are reporting.
> >>
> >> We are under the impression that a single instance of cyrus-imapd scales
> >> well up to about 1000 users (with about 1MB active
> >> memory per 'imapd' process), and so we are baffled as to what might be
> >> going on.
> >>
> >> A non-standard aspect of our installation which may have something to do
> >> with the problem is that we are
> >> running cyrus on an lvm2 partition that itself is running on top of
> >> drbd. Thinking that the remote writes
> >> to the drbd secondary might be causing delays, we put the primary in
> >> stand-alone mode so that the drbd layer
> >> was not doing any network activity (the drbd link is running at gigabit
> >> speed on its own crossover cable to
> >> the secondary box) and saw no significant change in behavior. Any issues
> >> due to locking and the lvm2 layer
> >> would, of course, still be present even with drbd's activity reduced to
> >> just local writes.
> >>
> >> Can anyone suggest what we might do next to debug the problem further?
> >> Needless to say, our users get
> >> extremely unhappy when trivial operations in their mail clients take
> >> over a minute to complete.
> >>
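
One possible next step, offered as a sketch rather than a diagnosis, is
to time small synchronous writes on the filesystem in question. If each
write+fsync cycle is slow while the system is busy, a process holding a
lock during such a write will hold it for a long time, and everyone
queued behind it waits. The function name fsync_latencies, the 4096-byte
write size, and the count of 20 are assumptions for illustration. The
demo below runs against the system temp directory; point it at a scratch
directory on the filesystem under test instead.

```python
import os
import tempfile
import time


def fsync_latencies(directory, count=20, size=4096):
    """Time `count` small write+fsync cycles in `directory`; return seconds per cycle."""
    payload = b"x" * size
    timings = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the kernel to flush this file's data to the device
            timings.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file again
    return timings


if __name__ == "__main__":
    # Placeholder target: replace with a directory on the filesystem under test.
    target = tempfile.gettempdir()
    t = fsync_latencies(target)
    print(f"fsync: min {min(t) * 1000:.1f} ms, max {max(t) * 1000:.1f} ms "
          f"over {len(t)} writes")
```

Comparing the numbers at a quiet time and at a busy time gives a rough
idea of how much the storage stack itself contributes.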
> >> Thank you for any thoughts or advice.
> >>
> >> Jeff Fookson
> >>
> >> --
> >> Jeffrey E. Fookson, PhD Phone: (520) 621 3091
> >> Support Systems Analyst, Principal jfookson at as.arizona.edu
> >> Steward Observatory
> >> University of Arizona
> >>
> >> ----
> >> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
> >> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
> >> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html