Slow lmtpd

Rob Mueller robm at fastmail.fm
Fri Mar 2 22:23:44 EST 2007


> see the output of the command "iostat -x sdb 5" (where 'sdb' is the device
> you have cyrus on) on your system.  Even if you aren't saturating your
> Gigabit Ethernet link to the ATA-over-Ethernet storage, you may be
> exceeding the number of I/O operations per second.

I thought ATA-over-Ethernet had high latency issues, which make lots of 
random writes (like the ones cyrus does) problematic? Hmmm, I can't find a 
reference for this at the moment, but I'm sure I read about it in a review 
somewhere.

The iostat command above will help determine whether the IO is saturated or 
not. The final column is a "% utilisation" figure. From the man page:

%util - Percentage of CPU time during which I/O requests were issued to the 
device (bandwidth utilization for the device). Device saturation occurs when 
this value is close to 100%.
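
As a rough cross-check of that figure, roughly the same number can be read 
straight from /proc/diskstats: column 13 there is the number of milliseconds 
the device has spent with I/O in flight, so the change in that counter 
divided by the sampling interval gives a busy percentage. This is only a 
sketch, assuming a 2.6-style /proc/diskstats and a whole device such as sdb 
rather than a partition. The helper names io_ticks and util_percent are made 
up for illustration and are not part of sysstat.

```shell
# Print the "time spent doing I/O" counter (milliseconds) for one device.
# Usage: io_ticks sdb [stats_file]   (stats_file defaults to /proc/diskstats)
io_ticks() {
    awk -v dev="$1" '$3 == dev { print $13 }' "${2:-/proc/diskstats}"
}

# Convert two counter samples taken interval_s seconds apart into a percentage.
# Usage: util_percent before_ms after_ms interval_s
util_percent() {
    awk -v b="$1" -v a="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (a - b) / (s * 1000) * 100 }'
}
```

For example, to sample sdb over 5 seconds:

before=$(io_ticks sdb); sleep 5; after=$(io_ticks sdb)
util_percent "$before" "$after" 5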

The other thing of interest would be the load on the machine, and processes 
in D state.

So run:

uptime

and:

ps auxw | grep -v ' S'

a few times (the grep hides sleeping processes) to see how many processes 
are in D state vs R state.
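
If eyeballing the ps output gets tedious, the counting can be scripted. The 
following is a minimal sketch, assuming a procps-style ps that accepts 
"-eo stat=" to print just the state column with no header. The function name 
count_states is hypothetical.

```shell
# Summarise process states read from stdin, one STAT value per line
# (for example the output of: ps -eo stat=).
# Only the first character of each value is the state; the rest are flags.
count_states() {
    awk '
        { state = substr($1, 1, 1); count[state]++ }
        END {
            printf "D=%d R=%d other=%d\n",
                count["D"], count["R"], NR - count["D"] - count["R"]
        }
    '
}
```

Run it a few times in a row to see whether the D count stays high:

for i in 1 2 3 4 5; do ps -eo stat= | count_states; sleep 2; done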

I've found IO debugging on Linux quite hard. Finding CPU bottlenecks is easy 
with things like oprofile, but I haven't yet been able to find anything that 
tells me how long a process sits in a "blocked" state waiting on IO so I can 
get an idea of which processes are generating the IO, and which particular 
parts of the code are generating it. Apparently 2.6.20 has better IO 
accounting, but I haven't tested it yet:

http://kernelnewbies.org/Linux_2_6_20

IO Accounting

The present per-task IO accounting isn't very useful. It simply counts the 
number of bytes passed into read() and write(). So if a process reads 1MB 
from an already-cached file, it is accused of having performed 1MB of I/O, 
which is 'wrong'. So this IO accounting implements per-process statistics of 
"storage I/O" (i.e.: I/O that _really_ does I/O on the storage device - 
Linux already had I/O storage statistics but it's not per-task). The data is 
reported through taskstats and procfs (/proc/$PID/io)

So it sounds like you can find out which processes are the issue (imap or 
lmtp, and with a bit of work maybe even particular users), but still not 
which parts of the code are causing the IOs.
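
To give an idea of what that might look like, here is an untested-on-2.6.20 
sketch that lists the storage I/O counters per process. It assumes a kernel 
with the new per-task IO accounting enabled, so that /proc/$PID/io exists and 
contains "read_bytes:" and "write_bytes:" lines, and it takes the process 
name from the "Name:" line of /proc/$PID/status. The function name 
io_by_process is made up, and the optional directory argument exists only so 
the function can be pointed somewhere other than /proc.

```shell
# Print "read_bytes write_bytes pid name" for every process whose io file
# is readable by the current user (run as root to see all processes).
# Usage: io_by_process [proc_root]   (proc_root defaults to /proc)
io_by_process() {
    proc_root=${1:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/io" ] || continue
        pid=${dir##*/}
        name=$(awk '$1 == "Name:" { print $2; exit }' "$dir/status" 2>/dev/null)
        awk -v pid="$pid" -v name="$name" '
            BEGIN { r = 0; w = 0 }
            $1 == "read_bytes:"  { r = $2 }
            $1 == "write_bytes:" { w = $2 }
            END { printf "%s %s %s %s\n", r, w, pid, name }
        ' "$dir/io" 2>/dev/null
    done
}
```

The counters are totals since each process started, so the biggest writers 
so far would be:

io_by_process | sort -k2,2nr | head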

Rob


