Slow lmtpd

Mon Mar 5 20:11:58 EST 2007

On Tue, 2007-03-06 at 09:13 +1100, Rob Mueller wrote:
> I've never seen over 100%, and it doesn't seem to make sense, so I'm 
> guessing it's a bogus value.

Yeah, I talked to the Coraid guys and they told me iostat reports
incorrect values for AoE.

> > avg-cpu:  %user   %nice %system %iowait   %idle
> >           2.53    0.00    5.26   89.98    2.23
> 
> However this shows that the system is mainly waiting on IO as we expected.

Yep, although I'd say it's a bit more than expected...

> Really you never want that many lmtpd processes, if they're all in use, it's 
> clear you've got an IO problem. Limiting it to 10 or so is probably a 
> reasonable number to avoid complete IO saturation and IO service delays.

The problem with limiting them to a lower value is that once the MTAs
start running their queues, their connections will start being refused,
since all the lmtpd processes will be in use, and the messages will go
back to the queue.

Maybe there is a number that will allow the system to react quickly
enough to avoid new connections being refused, but I tried with 50 and
it behaved as described above.

I had to reduce "lmtp_destination_concurrency_limit" in postfix from its
default of 20 to 10, and change the value of "queue_run_delay" on some
servers so that they don't all run their queues at the same time,
because that ends up causing the lmtpd process limit to be reached.
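
A minimal sketch of the arithmetic behind this, assuming the worst case
where every MTA runs its queue at once and opens as many LMTP connections
as its concurrency limit allows. The count of 6 MTAs is a made-up figure
for illustration and the function names are hypothetical; the limit of 50
lmtpd processes and the postfix limits of 20 and 10 are the values
mentioned above.

```python
def peak_lmtp_connections(num_mtas: int, concurrency_limit: int) -> int:
    """Worst case: every MTA runs its queue at once and uses its full quota."""
    return num_mtas * concurrency_limit


def refused_connections(num_mtas: int, concurrency_limit: int, lmtpd_limit: int) -> int:
    """Connections that find no free lmtpd, so the message returns to the queue."""
    return max(0, peak_lmtp_connections(num_mtas, concurrency_limit) - lmtpd_limit)


# Hypothetical: 6 MTAs is an assumption, not a figure from this thread.
for limit in (20, 10):
    refused = refused_connections(num_mtas=6, concurrency_limit=limit, lmtpd_limit=50)
    print(f"concurrency limit {limit}: up to {refused} connections refused")
```

Lowering the per-MTA limit shrinks the worst case, and staggering
"queue_run_delay" makes that worst case less likely to happen at all.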

> 1. How many disks in the AoE array?
> 2. Are they all one RAID array, or multiple RAID arrays? What type?
> 3. Are they one volume, or multiple volumes?

There is only one RAID-10 array using 8 disks. The whole system is
installed on this array, although directories like /var/lib/imap
and /var/spool/imap are mounted on different LVM volumes.

> Because of the latency for system <-> drive IO, the thing you want to try 
> and do is allow the OS to send more outstanding requests in parallel. The 
> problem is I don't know where in the FS <-> RAID <-> AoE path the 
> serialising bits are, so I'm not sure what the best things to do to increase 
> parallelism are, but the usual things to try are more RAID arrays with fewer 
> drives per array, and more volumes per RAID array. This gives more places 
> for parallelism to occur assuming there's not something holding some 
> internal lock somewhere.

The Coraid people suggested a larger array, using 14 disks, to increase
throughput by adding more striping elements. I can
try this for the next servers to go into production, but changing the
current one will be harder.
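
As a rough sketch of what "more striping elements" means here, assuming
a conventional RAID-10 layout of two-way mirrors striped together (how
the Coraid shelf actually lays out its arrays is an assumption, and the
function name is hypothetical):

```python
def raid10_stripe_elements(disks: int) -> int:
    """Number of mirrored pairs that data is striped across in RAID-10."""
    if disks < 4 or disks % 2 != 0:
        raise ValueError("RAID-10 with two-way mirrors needs an even count of at least 4 disks")
    return disks // 2


current = raid10_stripe_elements(8)     # the array in use today
suggested = raid10_stripe_elements(14)  # the size Coraid suggested
print(f"{current} -> {suggested} stripe elements ({suggested / current:.2f}x)")
```

This only counts the elements available for striping; it is an upper
bound on the possible gain, not a measured throughput figure.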

Thanks a lot,
Andre