rolling replication and many pop3d
Andy Bennett
andyjpb at ashurst.eu.org
Wed Nov 23 20:07:40 EST 2011
Hi,
>>> I enabled replication between two servers with version 2.4.10 cyrus.
>>> I set the option for the rolling replication, and it works fine but
>>> obviously I have a high CPU load.
>>> Unfortunately after 10 minutes of running processes pop3d increasing
>>> from 50 to over 200, making the server unusable for customers.
>>> Can you tell me why this increase is abnormal?
>> Can you use something like 'top' to work out which processes are
>> consuming most of the CPU time?
Thanks for the screenshots.
> This screenshot of top before :
>
> http://www.digicolor.net/cyrus/img1.jpg
This shows a load average of around 1. That means that, averaged over
the past 1 and 5 minutes, about 1 process has been running or waiting in
the run queue, ready to go.
This is therefore not an entirely idle machine. I see you're running a
nameserver and a few other things: it looks like they've been busy on the
CPU but not excessively so.
What's more worrying is the 4.5% of CPU time spent "waiting". This time
is accrued when processes are unable to run due to outstanding IO.
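If you'd rather have numbers than screenshots, here is a rough sketch for
measuring the iowait share yourself. It assumes a Linux /proc/stat whose
first line is the aggregate "cpu" line, with the fields in the order user,
nice, system, idle, iowait, irq, softirq, steal. The function name
iowait_pct is just something I made up for illustration.

```shell
#!/bin/sh
# Share of CPU time spent in iowait between two samples of the aggregate
# "cpu" line from /proc/stat (Linux only; field order is an assumption
# stated above).
iowait_pct() {
    # $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user..steal; later guest fields are already
            # counted inside user/nice, so they are skipped.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait is the 5th counter, i.e. field 6
        }
        END {
            d = t[2] - t[1]
            if (d <= 0) { print "0.0"; exit }   # no ticks elapsed
            printf "%.1f\n", 100 * (w[2] - w[1]) / d
        }'
}
```

For example: a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat);
iowait_pct "$a" "$b". The load averages themselves are the first three
fields of /proc/loadavg. Note that on Linux the load average also counts
processes in uninterruptible sleep (usually waiting on disk), so heavy IO
pushes it up even when the CPU is mostly idle.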
> and after 20 minutes of rolling replication :
>
> http://www.digicolor.net/cyrus/img5.jpg
This shows a load average of around 7 and 30% of CPU time spent in
iowait. This machine does not seem to be managing well with the IO load
of rolling replication.
>> Can you use something like 'vmstat 1' to show us how much I/O there is
>> on the system?
>
> This screenshot of top before :
>
> http://www.digicolor.net/cyrus/img3.jpg
This shows a system that is not reading anything from disk (bi). A small
number of blocks are being written out to disk (bo). Each line
represents activity for a period of 1 second, as specified by the
parameter to 'vmstat'.
> and after 20 minutes of rolling replication :
>
> http://www.digicolor.net/cyrus/img6.jpg
This system is writing to disk but it's very choppy. Sometimes it's
getting 7,000 blocks out per second and other times it's only 1,000.
Depending on your block size, this probably represents only a few MB per
second. The last column shows the iowait CPU percentage, and it's rather large.
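To put a number on that, here is a quick sketch of the arithmetic. It
assumes the 1024-byte blocks that the Linux vmstat man page documents
(older kernels may differ, so check yours). blocks_to_mib is a made-up
helper name.

```shell
#!/bin/sh
# Convert a vmstat bi/bo figure (blocks per second) to MiB per second.
blocks_to_mib() {
    # $1 = blocks per second
    # $2 = block size in bytes (assumed to be 1024 if not given)
    awk -v blocks="$1" -v size="${2:-1024}" \
        'BEGIN { printf "%.1f\n", blocks * size / (1024 * 1024) }'
}
```

'blocks_to_mib 7000' prints 6.8 and 'blocks_to_mib 1000' prints 1.0, so
under that assumption you're writing somewhere between about 1 and 7 MiB
per second.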
What IO subsystem do you have on this machine? What filesystem are you
using?
The IO on this machine appears to be struggling significantly.
I did a quick test on my laptop. I have a 2.5", 7,200rpm 200GB disk.
I ran this in my home directory to cause every file to be read from disk:
-----
$ find -type f | xargs cat > /dev/null
-----
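A word of caution if you copy that: plain xargs splits on whitespace, so
file names containing spaces will produce errors. A slightly more
defensive sketch, assuming your find and xargs support -print0 and -0 (the
GNU and BSD versions do). read_all_files is just an illustrative name.

```shell
#!/bin/sh
# Read every regular file under a directory and discard the contents,
# purely to generate read IO.
read_all_files() {
    # $1 = directory to read (defaults to the current directory)
    # NUL-separated names survive spaces and newlines in file names.
    find "${1:-.}" -type f -print0 | xargs -0 cat > /dev/null
}
```

Run it as 'read_all_files "$HOME"'. Files already in the page cache won't
hit the disk, so a second run will look far better than the first.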
'vmstat 1' gives lines like this:
-----
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa
 1  1   1680  30600    272 2575688    0    0 12376   576 2141 3424  3  6 47 43
 1  1   1680  32396    272 2576736    0    0 29988     0 1882 3632  5  7 47 41
 0  1   1680  33868    272 2576660    0    0 46820     0 2304 4443  4  8 48 39
 1  0   1680  33416    272 2578600    0    0 36716     0 2067 3733  3  7 48 42
 0  1   1680  34000    272 2581944    0    0 50432     0 1164 2983  3  6 50 42
 0  1   1680  31876    272 2585320    0    0 46464    64 1223 2964  3  8 49 40
 1  1   1680  30288    272 2588672    0    0 51712     0 1380 3658  3  7 46 43
 0  1   1680  29836    272 2590552    0    0 59776     0 1288 3549  4  7 47 42
 0  1   1680  30324    272 2592948    0    0 58368     0 1287 3568  2  7 49 41
 1  1   1680  30308    272 2593108    0    0 12800    18  917 1673  2  2 49 46
-----
These read rates (bi) are an order of magnitude greater than the rates
you're seeing. As you can see, I drop a few bi when I start to do bo, but
that's because I've only got a single spindle.
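To make the comparison less eyeball-driven, here's a small sketch that
averages the bi, bo and wa columns. It assumes the column layout shown
above (bi is field 9, bo is field 10, wa is field 16), which is what my
vmstat prints; check the header on yours. vmstat_avg is a name I've
invented.

```shell
#!/bin/sh
# Average the bi, bo and wa columns of 'vmstat' output read from stdin.
vmstat_avg() {
    awk '
        # Only count complete data rows: header lines start with a word,
        # and short (wrapped) lines have fewer than 16 fields.
        NF >= 16 && $1 ~ /^[0-9]+$/ {
            bi += $9; bo += $10; wa += $16; n++
        }
        END {
            if (n) printf "bi=%.0f bo=%.0f wa=%.0f\n", bi / n, bo / n, wa / n
        }'
}
```

Use it as 'vmstat 1 10 | vmstat_avg'. Bear in mind that the first sample
vmstat prints is an average since boot, so it will skew a short run.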
Please can you run the same test?
Can you track the source of all those writes in img3?
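One rough way to look for the writers, if tools such as iotop or
'pidstat -d' aren't installed: a sketch that lists processes by the
write_bytes counter in /proc/<pid>/io. This assumes a Linux kernel built
with per-task IO accounting, and you'll need to be root to see other
users' processes. /proc/<pid>/comm only exists on newer kernels, so the
name column may come out as '?'. top_writers is a made-up name.

```shell
#!/bin/sh
# List the processes with the largest cumulative write_bytes counters.
# Output columns: bytes written, PID, command name.
top_writers() {
    # $1 = how many processes to list (defaults to 10)
    for f in /proc/[0-9]*/io; do
        pid=${f#/proc/}
        pid=${pid%/io}
        # Skip processes we cannot read or that have already exited.
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$f" 2>/dev/null) || continue
        [ -n "$bytes" ] || continue
        name=$(cat "/proc/$pid/comm" 2>/dev/null) || name='?'
        printf '%s %s %s\n' "$bytes" "$pid" "$name"
    done | sort -rn | awk -v n="${1:-10}" 'NR <= n'
}
```

Run it as 'top_writers 5'. The counters are cumulative since each process
started, so run it twice a few seconds apart and compare.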
Please can you tell us more about the type of machine you are trying to
run this on?
Thanks for the info and screenshots so far.
>> Are most of the pop3d processes sleeping in iowait?
>> Do you use any other servers such as the imapd?
> Yes I have imapd
>
> This is screenshot of pstree before :
>
> http://www.digicolor.net/cyrus/img2.jpg
>
>
> This is screenshot of pstree after :
>
> http://www.digicolor.net/cyrus/img4.jpg
Regards,
@ndy
--
andyjpb at ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF