Bug or feature: (Too) Many imapd processes hanging around?
Andreas Haumer
andreas at xss.co.at
Thu Oct 4 07:32:34 EDT 2012
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi!
Since about one week I have a new machine (OpenSUSE 12.2, 64Bit)
running as IMAP server for about 30 users. The machine is moderately
loaded, there are about 10-15 users active during working hours.
OpenSUSE 12.2 comes with cyrus-imapd 2.3.18 as RPM
(cyrus-imapd-2.3.18-26.2.3.x86_64)
The machine was in preparation and test mode for about two months
(from August to September), is in production use since September
22nd and worked fine in the first days.
But since September 26th I see the number of imapd processes continuously
growing. As of today there are more than 2000 of them:
ravel:~ # ps ax | grep imapd | wc -l
2057
It seems since September 26th, imapd processes do not terminate anymore:
ravel:~ # ps aux | grep imapd | grep Sep23 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep24 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep25 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep26 | wc -l
2
ravel:~ # ps aux | grep imapd | grep Sep27 | wc -l
145
ravel:~ # ps aux | grep imapd | grep Sep28 | wc -l
425
ravel:~ # ps aux | grep imapd | grep Sep29 | wc -l
288
ravel:~ # ps aux | grep imapd | grep Sep30 | wc -l
288
ravel:~ # ps aux | grep imapd | grep Okt01 | wc -l
323
ravel:~ # ps aux | grep imapd | grep Okt02 | wc -l
350
ravel:~ # ps aux | grep imapd | grep Okt03 | wc -l
82
Needless to say, I also have many TCP connections from impad active:
ravel:/var/log # netstat -atnp | grep imapd | wc -l
2059
For all those processes, in the logfiles typically I see this:
ravel:/var/log # fgrep 15139 messages-20120929
Sep 27 15:15:20 ravel master[15139]: about to exec /usr/lib/cyrus/bin/imapd
Sep 27 15:15:20 ravel imap[15139]: executed
Sep 27 15:15:20 ravel imap[15139]: IOERROR: opening /cluster/var/imap/user_deny.db: No such file or directory
Sep 27 15:17:42 ravel imap[15139]: accepted connection
Sep 27 15:17:42 ravel imap[15139]: imapd:Loading hard-coded DH parameters
Sep 27 15:17:42 ravel imap[15139]: SSL_accept() incomplete -> wait
Sep 27 15:17:42 ravel imap[15139]: SSL_accept() succeeded -> done
Sep 27 15:17:42 ravel imap[15139]: starttls: TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits new) no authentication
Sep 27 15:17:42 ravel imap[15139]: login: ws01 [192.168.1.51] office plain+TLS User logged in
Sep 27 15:17:42 ravel imap[15139]: created decompress buffer of 4102 bytes
Sep 27 15:17:42 ravel imap[15139]: created compress buffer of 4102 bytes
Sep 27 15:17:42 ravel imap[15139]: client id: "name" "Thunderbird" "version" "15.0"
And that's about it, nothing more from process #15139.
When I look with strace at the process, I see the following:
ravel:/var/log # strace -p 15139
Process 15139 attached
fcntl(16, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}
lsof shows:
ravel:/var/log # lsof -p 15139
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
imapd 15139 cyrus cwd DIR 253,1 4096 2 /
imapd 15139 cyrus rtd DIR 253,1 4096 2 /
imapd 15139 cyrus txt REG 253,4 1662944 663247 /usr/lib/cyrus/bin/imapd
imapd 15139 cyrus mem REG 253,1 27016 2829 /lib64/libnss_sss.so.2
imapd 15139 cyrus mem REG 253,1 56792 2898 /lib64/libnss_nis-2.15.so
imapd 15139 cyrus mem REG 253,1 108167 2822 /lib64/libnsl-2.15.so
imapd 15139 cyrus mem REG 253,1 38523 2844 /lib64/libnss_compat-2.15.so
imapd 15139 cyrus mem REG 253,1 27377 2835 /lib64/libnss_dns-2.15.so
imapd 15139 cyrus mem REG 253,1 14640 4517 /lib64/libnss_mdns_minimal.so.2
imapd 15139 cyrus mem REG 253,1 62418 2836 /lib64/libnss_files-2.15.so
imapd 15139 cyrus mem REG 253,4 23080 524326 /usr/lib64/sasl2/libcrammd5.so.2.0.25
imapd 15139 cyrus mem REG 253,4 18952 524325 /usr/lib64/sasl2/libplain.so.2.0.25
imapd 15139 cyrus mem REG 253,4 18952 524324 /usr/lib64/sasl2/liblogin.so.2.0.25
imapd 15139 cyrus mem REG 253,4 56360 524316 /usr/lib64/sasl2/libdigestmd5.so.2.0.25
imapd 15139 cyrus mem REG 253,4 18920 524318 /usr/lib64/sasl2/libanonymous.so.2.0.25
imapd 15139 cyrus mem REG 253,4 27032 524314 /usr/lib64/sasl2/libsasldb.so.2.0.25
imapd 15139 cyrus mem REG 253,4 257384 524297 /usr/lib64/libgssapi_krb5.so.2.2
imapd 15139 cyrus mem REG 253,4 31368 524327 /usr/lib64/sasl2/libgssapiv2.so.2.0.25
imapd 15139 cyrus mem REG 147,0 41844736 3670122 /cluster/var/imap/db/__db.005
imapd 15139 cyrus mem REG 147,0 4194304 3670121 /cluster/var/imap/db/__db.004
imapd 15139 cyrus mem REG 147,0 2629632 3670120 /cluster/var/imap/db/__db.003
imapd 15139 cyrus mem REG 147,0 6995968 3670055 /cluster/var/imap/db/__db.002
imapd 15139 cyrus mem REG 253,1 118080 2763 /lib64/libselinux.so.1
imapd 15139 cyrus mem REG 253,1 131219 2896 /lib64/libpthread-2.15.so
imapd 15139 cyrus mem REG 253,1 98059 2892 /lib64/libresolv-2.15.so
imapd 15139 cyrus mem REG 253,1 14608 2758 /lib64/libkeyutils.so.1.4
imapd 15139 cyrus mem REG 253,4 39776 541196 /usr/lib64/libkrb5support.so.0.1
imapd 15139 cyrus mem REG 253,4 162728 524985 /usr/lib64/libk5crypto.so.3.1
imapd 15139 cyrus mem REG 253,1 19067 2849 /lib64/libdl-2.15.so
imapd 15139 cyrus mem REG 253,1 1957616 2838 /lib64/libc-2.15.so
imapd 15139 cyrus mem REG 253,1 40944 2821 /lib64/libwrap.so.0.7.6
imapd 15139 cyrus mem REG 253,1 88304 2832 /lib64/libz.so.1.2.7
imapd 15139 cyrus mem REG 253,4 1556904 527149 /usr/lib64/libdb-4.8.so
imapd 15139 cyrus mem REG 253,1 1897944 2833 /lib64/libcrypto.so.1.0.0
imapd 15139 cyrus mem REG 253,1 423888 2899 /lib64/libssl.so.1.0.0
imapd 15139 cyrus mem REG 253,4 14768 527318 /usr/lib64/libcom_err.so.2.1
imapd 15139 cyrus mem REG 253,4 868264 524975 /usr/lib64/libkrb5.so.3.3
imapd 15139 cyrus mem REG 253,4 118736 524767 /usr/lib64/libsasl2.so.2.0.25
imapd 15139 cyrus mem REG 253,1 163471 2846 /lib64/ld-2.15.so
imapd 15139 cyrus DEL REG 147,0 3802117 /cluster/var/spool/imap/user/office/cyrus.cache
imapd 15139 cyrus DEL REG 147,0 3801204 /cluster/var/spool/imap/user/office/cyrus.index
imapd 15139 cyrus mem REG 147,0 215 3801660 /cluster/var/spool/imap/user/office/cyrus.header
imapd 15139 cyrus mem REG 147,0 101412 3685002 /cluster/var/imap/mailboxes.db
imapd 15139 cyrus mem REG 147,0 144 3684752 /cluster/var/imap/annotations.db
imapd 15139 cyrus mem REG 147,0 49152 3685000 /cluster/var/imap/db/__db.006
imapd 15139 cyrus mem REG 147,0 24576 3670034 /cluster/var/imap/db/__db.001
imapd 15139 cyrus 0u IPv4 11013449 0t0 TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd 15139 cyrus 1u IPv4 11013449 0t0 TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd 15139 cyrus 2u IPv4 11013449 0t0 TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd 15139 cyrus 3w FIFO 0,8 0t0 394112 pipe
imapd 15139 cyrus 4u IPv4 394110 0t0 TCP *:imap (LISTEN)
imapd 15139 cyrus 5u unix 0xffff88020fab0a80 0t0 11013424 socket
imapd 15139 cyrus 6u REG 147,0 101412 3685002 /cluster/var/imap/mailboxes.db
imapd 15139 cyrus 7u unix 0xffff880049d0ac00 0t0 11013425 socket
imapd 15139 cyrus 8u REG 147,0 144 3684752 /cluster/var/imap/annotations.db
imapd 15139 cyrus 9u REG 147,0 0 3670172 /cluster/var/imap/socket/imap-0.lock
imapd 15139 cyrus 10u REG 147,0 43 3686567 /cluster/var/imap/proc/15139
imapd 15139 cyrus 11u REG 147,0 3792896 3670949 /cluster/var/imap/tls_sessions.db
imapd 15139 cyrus 12u unix 0xffff8801dfb6f280 0t0 11021883 socket
imapd 15139 cyrus 13u REG 147,0 215 3801660 /cluster/var/spool/imap/user/office/cyrus.header
imapd 15139 cyrus 14u REG 147,0 976 3801204 /cluster/var/spool/imap/user/office/cyrus.index (deleted)
imapd 15139 cyrus 15u REG 147,0 16736 3802117 /cluster/var/spool/imap/user/office/cyrus.cache (deleted)
imapd 15139 cyrus 16u REG 147,0 57804 3686853 /cluster/var/imap/user/o/office.seen
So, fileid #16 points to /cluster/var/imap/user/o/office.seen
ravel:~ # ll /cluster/var/imap/user/o/office.seen
- -rw------- 1 cyrus mail 57804 27. Sep 14:11 /cluster/var/imap/user/o/office.seen
I checked about 20 processes and all of them hang on the F_SETLKW
fcntl() call on fileid #16 which always points to the same file
(/cluster/var/imap/user/o/office.seen)
/cluster is a ext4 filesystem on a DRBD device:
/dev/drbd0 on /cluster type ext4 (rw,relatime,data=ordered)
The DRBD device connects two identical machines running as
active/standby cluster (using pacemaker/corosync). The IMAP
server is configured as cluster resource using the LSB resource
script and was started on September 22nd and is running since then.
I am not aware of any change in software or configuration on
or around Septemer 26th, which might explain this phenomenon.
Users currently can still work with the system and do not see
and performance degradation (yet). Email delivery to the IMAP
server works, too (I'm using sendmail with LMTP)
I have worked with many cyrus-imapd installations in the past (clustered
and non-clustered) and have never seen such a phenomenon. Is this a bug
or a feature?
I don't think that this is a healty situation, I think something
is wrong here and should be corrected ASAP.
Has anyone seen something like this before?
Any idea on how to proceed with this?
Thanks!
- - andreas
- --
Andreas Haumer | mailto:andreas at xss.co.at
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
iD8DBQFQbXPPxJmyeGcXPhERAjnrAKCHEsZ3LaeAND678pZ4nqiL4n2EkACgsl/f
R/RpbMXmi02UPZKOAGn+mws=
=diYl
-----END PGP SIGNATURE-----
More information about the Info-cyrus
mailing list