High Load Avg and Context Switches
Jeremy Sanders
jsanders at teklinks.com
Mon Mar 17 23:21:52 EST 2003
Why would pop3d have all of the hot backup db files open? If 50 pop3d
processes are trying to access the same 4 files, I can see why there
would be some contention there (a quick way to count the processes
holding them is sketched after the listing).
pop3d  1423  cyrus  mem  REG  58,9    270336   324606   /var/spool/imap/db/__db.002
pop3d  1423  cyrus  mem  REG  58,9     98304   324607   /var/spool/imap/db/__db.003
pop3d  1423  cyrus  mem  REG  58,9  17063936   324642   /var/spool/imap/db/__db.004
pop3d  1423  cyrus  mem  REG  58,9     32768   324827   /var/spool/imap/db/__db.005
pop3d  1423  cyrus  mem  REG  72,2     45415  3597474   /lib/libnss_files-2.2.5.so
pop3d  1423  cyrus  mem  REG  72,2     46117  3597482   /lib/libnss_nisplus-2.2.5.so
pop3d  1423  cyrus  mem  REG  72,2   1402035  1733410   /lib/i686/libc-2.2.5.so
pop3d  1423  cyrus   5u  REG  58,9   7703932   324844   /var/spool/imap/db/log.0000000022
pop3d  1423  cyrus   6r  REG  58,9   7703932   324844   /var/spool/imap/db/log.0000000022
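For what it's worth, a quick way to see how many pop3d processes
actually have one of those region files open or mapped is to point
lsof at the file itself. This is just a rough check, using the path
from the listing above; lsof's exact output format varies a bit
between versions:

    # count the pop3d processes holding the big __db.004 region file
    lsof /var/spool/imap/db/__db.004 | grep -c pop3d

    # or list everything with files open anywhere under the db environment
    lsof +D /var/spool/imap/db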
Thanks,
Jeremy
>>> "Jeremy Sanders" <jsanders at teklinks.com> 03/17/03 04:15PM >>>
Hello,
I'm running Cyrus IMAP 2.1.12 on a Red Hat 7.3 box with kernel 2.4.20.
The imap partition is on a Compaq RA4100 with a Compaq Fibre Channel
HBA in the server. It is an LVM ext3 partition mounted with
noatime,data=ordered.
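For reference, the relevant fstab entry looks roughly like the
following; the LVM device name here is made up, ours is different:

    # /etc/fstab (sketch): imap spool on LVM, ext3, no atime updates, ordered journaling
    /dev/vg00/lv_imap  /var/spool/imap  ext3  noatime,data=ordered  1 2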
Here is some vmstat output; notice the spike in context switches (the cs column):
[root at mailserv2 root]# vmstat -n 2
   procs                      memory      swap          io     system         cpu
 r  b  w   swpd    free    buff   cache  si  so    bi    bo    in     cs  us  sy  id
 0 36  3    512 1045092  101396  194348   0   0    92   589   334    542   9  28  64
 2 32  2    512 1040448  101408  194504   0   0    54   802   621   1393  61  12  27
 0 36  2    512 1040500  101408  194492   0   0     0     0   152    420  34   2  64
 0 40  2    512 1040412  101428  194516   0   0     0   124   235    197   1   1  98
 0 10  1    512 1042044  101484  194488   0   0    68   778   388    417   7   4  89
 4  6  1    512 1042312  101548  194500   0   0     0   778   794 245291   3  72  25
 2 10  3    512 1042328  101564  194508   0   0     0   154   299 303798  17  83   0
 2 14  3    512 1042320  101564  194512   0   0     0     0   178 378398   1  99   0
 2 18  3    512 1041972  101572  194520   0   0     0   104   190 358011   0 100   0
 2 15  2    512 1038820  101604  195456   0   0    42  1804   434  58432  25  33  42
 0 20  1    512 1038480  101632  195452   0   0     0   798   194    197   0   2  98
 0 30  1    512 1038476  101632  195452   0   0     0     0   152    229   0   1  99
 0 56  1    512 1037040  101668  195588   0   0    64   512   330    926  14  12  74
 2 56  1    512 1036748  101668  195580   0   0     0   240   194    573  10   8  82
 2  7  1    512 1035512  101796  195648   0   0    60  1858  1245   1424  13  12  75
 0 13  1    512 1035460  101816  195648   0   0     0   198   161    252  18   1  81
 0 14  1    512 1035460  101816  195648   0   0     0     0   129     75   0   0  99
 0 15  2    512 1035444  101832  195648   0   0     0    32   142     88   1   0  99
 0  0  0    512 1037516  101940  195024   0   0     6   726   534    680   3   3  94
 0  0  0    512 1037712  102012  194932   0   0     0   614   569    968   4   4  92
 0  0  0    512 1037664  102056  194916   0   0     8   600   335    511   1   2  97
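To make the spikes easier to catch over a longer run, a simple filter
over vmstat does the job; the 100000 threshold is just a number picked
for illustration (cs is the 13th column in this output format):

    # print only the samples where the context-switch rate goes through the roof
    vmstat -n 2 | awk 'NR > 2 && $13 > 100000'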
Its I/O read activity is also high, as would be expected from a server
that is being popped continuously by 1,000 Outlook clients. The load
average ranges from 11 to 48. If it is around 11-15 the server runs
fine; if it gets over 20, the server is noticeably slower. Up until
last week the load average was consistently below 5. The processor is
mostly idle. I thought the ext3 partition might be corrupt, so I made
a new LVM partition and cpio'ed the data over to the new filesystem.
That didn't help either. We've also adjusted elvtune parameters in
both directions without any appreciable difference. The only change
that has had a positive impact so far was changing data=journaled to
data=ordered.
We also pruned the delivery database and increased the
/proc/sys/fs/file-max parameter.
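In case it helps anyone retrace the steps, roughly what we tried looks
like the following. The device name, elvtune latencies, file-max value,
and prune age are examples only, not our exact settings:

    # elvtune: tried both raising and lowering the read/write latencies
    elvtune -r 1024 -w 2048 /dev/sda

    # raise the system-wide open file limit
    echo 65536 > /proc/sys/fs/file-max

    # prune old entries from the duplicate delivery database (run as the cyrus user)
    ctl_deliver -E 3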
Any help would be appreciated.
Thanks,
Jeremy Sanders, CCNP RHCE CNE
Senior System Engineer
Teklinks, Inc.
205-249-5988