Load spikes when new email arrives

Wed Jan 23 14:11:08 EST 2013

On Wed, Jan 23, 2013 at 1:55 PM, Andrew Morgan <morgan at orst.edu> wrote:

> On Wed, 23 Jan 2013, francis picabia wrote:
>
>> Here are more stats.  Do these look average for performance?
>> It is difficult to understand why the system was working with few
>> load spikes before.
>>
>> A mailman mailing list sends a 10 kbyte message to 4000
>> users having accounts on this cyrus system.  If I
>> grep "Delivered" in the maillog by the minute I can
>> see how fast the messages are stored.
>>
>> e.g.:
>> # grep Delivered /var/log/maillog | grep 'Jan 23 10:37' | wc -l
>>    696
>>
>> That is the best minute.  This peak event pushed the load to 14
>> for 12 minutes, during which it averaged 604 messages
>> delivered to cyrus mailboxes per minute.  Is that
>> reasonable for a maximum delivery rate?
>>
>> I've also backed out the change (yesterday) to
>> /sys/block/sda/queue/nr_requests
>> I think it was pushing the load higher and there is no advantage
>> on my hardware (SAS with Perc 5/i, RAID 5 over 4 disks)
>> to run with a low value for nr_requests.
>>
>
> You can certainly achieve higher delivery rates, but that all depends on
> your underlying hardware and how you have partitioned your system.
>
> Why don't you start running "iostat -x 5" on the system?  Leave this
> running to give you an idea of the baseline behavior and then look at it
> during periods of high load.  I suspect you will see that your svctm and
> %util will go up dramatically when a large number of messages are being
> delivered.  But, let's not make decisions based on assumptions!  :)
>
> On my Cyrus Murder frontends (3 of them), I have limited LMTP connections
> to 25 in cyrus.conf:
>
>   lmtp          cmd="/usr/local/cyrus/bin/lmtpproxyd" listen="lmtp" proto="tcp4" prefork=0 maxchild=25
>
> This prevents our mail relays (Postfix) from opening too many simultaneous
> LMTP connections, which can cause too much I/O contention.  Take a look
> during your periods of high load to see how many lmtpd processes are
> running.  You may want to limit the number.
>
>         Andy
>

Thanks for the response.  I have been checking iostat whenever a
number of messages build up in the active queue.

Here is a sample snapshot from a script I run (ignoring the first
iostat output of averages):

Active in queue: 193
 12:47:01 up 5 days,  5:23,  6 users,  load average: 14.11, 9.22, 4.67

Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
sda5         3.25   281.00    19.75   129.50   654.00  3384.00    27.06     5.53    36.24     6.69    99.80

svctm is about the same as when not under load; it went above 7 only
once.
There is also this comment questioning the validity of tracking svctm:
http://www.xaprb.com/blog/2010/09/06/beware-of-svctm-in-linuxs-iostat/

%util often reaches close to 100% when there is a queue to process.
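
As a rough cross-check on the sda5 sample above, the iostat columns can be
recomputed from one another. The sketch below is illustrative arithmetic
only: it assumes 512-byte sectors and uses Little's law as an approximation
for queue depth, and the function names are made up for this example.

```python
# Values copied from the sda5 line of the iostat sample above.
SAMPLE = {
    "r/s": 19.75,
    "w/s": 129.50,
    "rsec/s": 654.00,
    "wsec/s": 3384.00,
    "await": 36.24,  # milliseconds
}

SECTOR_BYTES = 512  # assumption: iostat counts 512-byte sectors


def avg_request_sectors(s):
    """Average request size: total sectors/s divided by total requests/s."""
    return (s["rsec/s"] + s["wsec/s"]) / (s["r/s"] + s["w/s"])


def est_queue_depth(s):
    """Little's law estimate: requests/s times await converted to seconds."""
    return (s["r/s"] + s["w/s"]) * s["await"] / 1000.0


def write_mb_per_s(s):
    """Write throughput in MB/s (decimal megabytes)."""
    return s["wsec/s"] * SECTOR_BYTES / 1e6


if __name__ == "__main__":
    print(f"avg request size : {avg_request_sectors(SAMPLE):.2f} sectors")
    print(f"est. queue depth : {est_queue_depth(SAMPLE):.2f}")
    print(f"write throughput : {write_mb_per_s(SAMPLE):.2f} MB/s")
```

For this sample the average request works out to about 27.06 sectors,
matching avgrq-sz, and the estimated queue depth is about 5.4, close to the
reported avgqu-sz of 5.53. Write throughput is only around 1.7 MB/s, which
suggests the device is busy with many small requests rather than with raw
bandwidth.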

sda5 is where the cyrus mail/imap lives.  Our account names all begin with
numbers, so almost all mail accounts are under the q folder.

I'll check the lmtp process numbers as well.  I've set some in_flow_delay
on the postfix side, which might keep the load from peaking as sharply.
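
To judge whether a change like this flattens the peaks, the per-minute grep
quoted earlier can be generalized to count every minute in one pass. This is
a minimal sketch, assuming syslog-style lines whose first 12 characters are
the timestamp down to the minute (e.g. "Jan 23 10:37") and that delivery
lines contain the word "Delivered"; the function name is hypothetical.

```python
import sys
from collections import Counter


def deliveries_per_minute(lines):
    """Count lines containing 'Delivered', keyed by the 'Mon DD HH:MM' prefix."""
    counts = Counter()
    for line in lines:
        if "Delivered" in line:
            # Assumption: classic syslog timestamp, so [:12] is "Mon DD HH:MM".
            counts[line[:12]] += 1
    return counts


# Only runs when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        counts = deliveries_per_minute(fh)
    # Counter keeps insertion order, so minutes print in log order.
    for minute, n in counts.items():
        print(f"{minute}  {n}")
```

Saved under any name and run with /var/log/maillog as its argument, it prints
one line per minute, so the peak and the average rate during a list delivery
can be read off directly.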