Slow lmtpd
Rob Mueller
robm at fastmail.fm
Fri Mar 2 22:23:44 EST 2007
> see the output of the command "iostat -x sdb 5" (where 'sdb' is the device
> you have cyrus on) on your system. Even if you aren't saturating your
> Gigabit Ethernet link to the ATA-over-Ethernet storage, you may be
> exceeding the number of I/O operations per second.
I thought ATA-over-Ethernet had latency issues that make lots of random
writes, like cyrus does, problematic? Hmmm, I can't find a reference for
this at the moment, but I'm sure I read about it in a review somewhere.
The iostat command above will help determine whether the IO is saturated.
The final column is a "%util" figure. From the man page:
%util - Percentage of CPU time during which I/O requests were issued to the
device (bandwidth utilization for the device). Device saturation occurs when
this value is close to 100%.
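If you don't want to watch that column by eye, something like the following
rough sketch could flag the busy samples. It reads "iostat -x" output on
stdin and prints any line for the named device whose last column is at or
above a threshold. The function name flag_saturated and the 90% default are
my own inventions, and it assumes %util is the last column, which is worth
checking against your sysstat version.

```shell
# Hypothetical helper: print iostat -x samples for one device where %util
# (ASSUMED to be the last column) is at or above a threshold.
# Usage: iostat -x sdb 5 | flag_saturated sdb 90
flag_saturated() {
    dev="$1"            # device name as shown in iostat's first column
    limit="${2:-90}"    # threshold in percent; 90 is an arbitrary default
    awk -v dev="$dev" -v limit="$limit" '
        # Header lines and other devices fail the first test and are skipped.
        $1 == dev && ($NF + 0) >= (limit + 0) {
            print dev " util=" $NF "%"
        }
    '
}
```

Running it as "iostat -x sdb 5 | flag_saturated sdb 90" prints nothing while
the device is comfortable and one line per 5 second sample once it gets
close to saturation.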
The other thing of interest would be the load on the machine, and processes
in D state.
So running:
uptime
And:
ps auxw | grep -v ' S'
a few times will show how many processes are in D state (uninterruptible
sleep, usually waiting on IO) vs R state (runnable).
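To get a number rather than a listing, a sketch like this tallies the first
letter of each process state. It assumes procps ps, where "-eo stat=" prints
only the state column with no header; count_states is a made-up name.

```shell
# Hypothetical helper: tally process states from ps STAT values on stdin,
# one per line. Only the first letter is counted (D, R, S, ...), so
# modifiers such as "+" or "s" are ignored.
# Usage: ps -eo stat= | count_states
count_states() {
    awk '
        NF  { total++; n[substr($1, 1, 1)]++ }
        END {
            printf "D=%d R=%d other=%d\n",
                   n["D"], n["R"], total - n["D"] - n["R"]
        }
    '
}
```

Run "ps -eo stat= | count_states" a few times in a row. The ps process
itself will normally be counted under R, so a lone R is nothing to worry
about, but a D count that stays high suggests processes stuck waiting on IO.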
I've found IO debugging on linux quite hard. Finding CPU bottlenecks is easy
with things like oprofile, but I haven't yet been able to find anything that
tells me how long a process sits in a "blocked" state waiting on IO so I can
get an idea of which processes are generating the IO, and which particular
parts of the code are generating it. Apparently 2.6.20 has better IO
accounting, but I haven't tested it yet:
http://kernelnewbies.org/Linux_2_6_20
IO Accounting
The present per-task IO accounting isn't very useful. It simply counts the
number of bytes passed into read() and write(). So if a process reads 1MB
from an already-cached file, it is accused of having performed 1MB of I/O,
which is 'wrong'. So this IO accounting implements per-process statistics of
"storage I/O" (i.e.: I/O that _really_ does I/O on the storage device -
Linux already had I/O storage statistics but it's not per-task). The data is
reported through taskstats and procfs (/proc/$PID/io)
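I haven't tried this on a real 2.6.20 box, so treat it as a sketch only. On
a kernel with that accounting enabled, something like the following should
pull the storage read and write byte counts out of each /proc/$PID/io file.
The read_bytes and write_bytes names are the fields in that file; the
function name storage_io and the output layout are just for illustration,
and you will generally need to be root to read other users' io files.

```shell
# Hypothetical helper: print "read_bytes write_bytes path" for each
# /proc/PID/io-style file given as an argument. Files that fail the
# readability test (e.g. other users' processes) are skipped.
# Usage: storage_io /proc/[0-9]*/io | sort -k2,2nr | head
storage_io() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        awk -v f="$f" '
            BEGIN                { r = 0; w = 0 }
            $1 == "read_bytes:"  { r = $2 }   # bytes really read from storage
            $1 == "write_bytes:" { w = $2 }   # bytes really sent to storage
            END                  { print r, w, f }
        ' "$f" 2>/dev/null
    done
}
```

The usage line in the comment sorts by bytes written, largest first. The
counters are cumulative since each process started, so to see who is busy
right now you would sample twice and compare.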
So it sounds like you can find out which processes are the issue (imap or
lmtp, and maybe even particular users with a bit of work), but still not
which parts of the code are causing the IOs.
Rob
More information about the Info-cyrus
mailing list