Bug or feature: (Too) Many imapd processes hanging around?

Andreas Haumer andreas at xss.co.at
Thu Oct 4 07:32:34 EDT 2012


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Since about one week I have a new machine (OpenSUSE 12.2, 64Bit)
running as IMAP server for about 30 users. The machine is moderately
loaded, there are about 10-15 users active during working hours.

OpenSUSE 12.2 comes with cyrus-imapd 2.3.18 as RPM
(cyrus-imapd-2.3.18-26.2.3.x86_64)

The machine was in preparation and test mode for about two months
(from August to September), is in production use since September
22nd and worked fine in the first days.

But since September 26th I see the number of imapd processes continuously
growing. As of today there are more than 2000 of them:

ravel:~ # ps ax | grep imapd | wc -l
2057

It seems since September 26th, imapd processes do not terminate anymore:

ravel:~ # ps aux | grep imapd | grep Sep23 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep24 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep25 | wc -l
0
ravel:~ # ps aux | grep imapd | grep Sep26 | wc -l
2
ravel:~ # ps aux | grep imapd | grep Sep27 | wc -l
145
ravel:~ # ps aux | grep imapd | grep Sep28 | wc -l
425
ravel:~ # ps aux | grep imapd | grep Sep29 | wc -l
288
ravel:~ # ps aux | grep imapd | grep Sep30 | wc -l
288
ravel:~ # ps aux | grep imapd | grep Okt01 | wc -l
323
ravel:~ # ps aux | grep imapd | grep Okt02 | wc -l
350
ravel:~ # ps aux | grep imapd | grep Okt03 | wc -l
82

Needless to say, I also have many TCP connections from impad active:

ravel:/var/log # netstat -atnp | grep imapd | wc -l
2059

For all those processes, in the logfiles typically I see this:

ravel:/var/log # fgrep 15139 messages-20120929
Sep 27 15:15:20 ravel master[15139]: about to exec /usr/lib/cyrus/bin/imapd
Sep 27 15:15:20 ravel imap[15139]: executed
Sep 27 15:15:20 ravel imap[15139]: IOERROR: opening /cluster/var/imap/user_deny.db: No such file or directory
Sep 27 15:17:42 ravel imap[15139]: accepted connection
Sep 27 15:17:42 ravel imap[15139]: imapd:Loading hard-coded DH parameters
Sep 27 15:17:42 ravel imap[15139]: SSL_accept() incomplete -> wait
Sep 27 15:17:42 ravel imap[15139]: SSL_accept() succeeded -> done
Sep 27 15:17:42 ravel imap[15139]: starttls: TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits new) no authentication
Sep 27 15:17:42 ravel imap[15139]: login: ws01 [192.168.1.51] office plain+TLS User logged in
Sep 27 15:17:42 ravel imap[15139]: created decompress buffer of 4102 bytes
Sep 27 15:17:42 ravel imap[15139]: created compress buffer of 4102 bytes
Sep 27 15:17:42 ravel imap[15139]: client id: "name" "Thunderbird" "version" "15.0"

And that's about it, nothing more from process #15139.

When I look with strace at the process, I see the following:

ravel:/var/log # strace -p 15139
Process 15139 attached
fcntl(16, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}

lsof shows:

ravel:/var/log # lsof -p 15139
COMMAND   PID  USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
imapd   15139 cyrus  cwd    DIR              253,1     4096        2 /
imapd   15139 cyrus  rtd    DIR              253,1     4096        2 /
imapd   15139 cyrus  txt    REG              253,4  1662944   663247 /usr/lib/cyrus/bin/imapd
imapd   15139 cyrus  mem    REG              253,1    27016     2829 /lib64/libnss_sss.so.2
imapd   15139 cyrus  mem    REG              253,1    56792     2898 /lib64/libnss_nis-2.15.so
imapd   15139 cyrus  mem    REG              253,1   108167     2822 /lib64/libnsl-2.15.so
imapd   15139 cyrus  mem    REG              253,1    38523     2844 /lib64/libnss_compat-2.15.so
imapd   15139 cyrus  mem    REG              253,1    27377     2835 /lib64/libnss_dns-2.15.so
imapd   15139 cyrus  mem    REG              253,1    14640     4517 /lib64/libnss_mdns_minimal.so.2
imapd   15139 cyrus  mem    REG              253,1    62418     2836 /lib64/libnss_files-2.15.so
imapd   15139 cyrus  mem    REG              253,4    23080   524326 /usr/lib64/sasl2/libcrammd5.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4    18952   524325 /usr/lib64/sasl2/libplain.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4    18952   524324 /usr/lib64/sasl2/liblogin.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4    56360   524316 /usr/lib64/sasl2/libdigestmd5.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4    18920   524318 /usr/lib64/sasl2/libanonymous.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4    27032   524314 /usr/lib64/sasl2/libsasldb.so.2.0.25
imapd   15139 cyrus  mem    REG              253,4   257384   524297 /usr/lib64/libgssapi_krb5.so.2.2
imapd   15139 cyrus  mem    REG              253,4    31368   524327 /usr/lib64/sasl2/libgssapiv2.so.2.0.25
imapd   15139 cyrus  mem    REG              147,0 41844736  3670122 /cluster/var/imap/db/__db.005
imapd   15139 cyrus  mem    REG              147,0  4194304  3670121 /cluster/var/imap/db/__db.004
imapd   15139 cyrus  mem    REG              147,0  2629632  3670120 /cluster/var/imap/db/__db.003
imapd   15139 cyrus  mem    REG              147,0  6995968  3670055 /cluster/var/imap/db/__db.002
imapd   15139 cyrus  mem    REG              253,1   118080     2763 /lib64/libselinux.so.1
imapd   15139 cyrus  mem    REG              253,1   131219     2896 /lib64/libpthread-2.15.so
imapd   15139 cyrus  mem    REG              253,1    98059     2892 /lib64/libresolv-2.15.so
imapd   15139 cyrus  mem    REG              253,1    14608     2758 /lib64/libkeyutils.so.1.4
imapd   15139 cyrus  mem    REG              253,4    39776   541196 /usr/lib64/libkrb5support.so.0.1
imapd   15139 cyrus  mem    REG              253,4   162728   524985 /usr/lib64/libk5crypto.so.3.1
imapd   15139 cyrus  mem    REG              253,1    19067     2849 /lib64/libdl-2.15.so
imapd   15139 cyrus  mem    REG              253,1  1957616     2838 /lib64/libc-2.15.so
imapd   15139 cyrus  mem    REG              253,1    40944     2821 /lib64/libwrap.so.0.7.6
imapd   15139 cyrus  mem    REG              253,1    88304     2832 /lib64/libz.so.1.2.7
imapd   15139 cyrus  mem    REG              253,4  1556904   527149 /usr/lib64/libdb-4.8.so
imapd   15139 cyrus  mem    REG              253,1  1897944     2833 /lib64/libcrypto.so.1.0.0
imapd   15139 cyrus  mem    REG              253,1   423888     2899 /lib64/libssl.so.1.0.0
imapd   15139 cyrus  mem    REG              253,4    14768   527318 /usr/lib64/libcom_err.so.2.1
imapd   15139 cyrus  mem    REG              253,4   868264   524975 /usr/lib64/libkrb5.so.3.3
imapd   15139 cyrus  mem    REG              253,4   118736   524767 /usr/lib64/libsasl2.so.2.0.25
imapd   15139 cyrus  mem    REG              253,1   163471     2846 /lib64/ld-2.15.so
imapd   15139 cyrus  DEL    REG              147,0           3802117 /cluster/var/spool/imap/user/office/cyrus.cache
imapd   15139 cyrus  DEL    REG              147,0           3801204 /cluster/var/spool/imap/user/office/cyrus.index
imapd   15139 cyrus  mem    REG              147,0      215  3801660 /cluster/var/spool/imap/user/office/cyrus.header
imapd   15139 cyrus  mem    REG              147,0   101412  3685002 /cluster/var/imap/mailboxes.db
imapd   15139 cyrus  mem    REG              147,0      144  3684752 /cluster/var/imap/annotations.db
imapd   15139 cyrus  mem    REG              147,0    49152  3685000 /cluster/var/imap/db/__db.006
imapd   15139 cyrus  mem    REG              147,0    24576  3670034 /cluster/var/imap/db/__db.001
imapd   15139 cyrus    0u  IPv4           11013449      0t0      TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd   15139 cyrus    1u  IPv4           11013449      0t0      TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd   15139 cyrus    2u  IPv4           11013449      0t0      TCP bartok:imap->ws01:57187 (CLOSE_WAIT)
imapd   15139 cyrus    3w  FIFO                0,8      0t0   394112 pipe
imapd   15139 cyrus    4u  IPv4             394110      0t0      TCP *:imap (LISTEN)
imapd   15139 cyrus    5u  unix 0xffff88020fab0a80      0t0 11013424 socket
imapd   15139 cyrus    6u   REG              147,0   101412  3685002 /cluster/var/imap/mailboxes.db
imapd   15139 cyrus    7u  unix 0xffff880049d0ac00      0t0 11013425 socket
imapd   15139 cyrus    8u   REG              147,0      144  3684752 /cluster/var/imap/annotations.db
imapd   15139 cyrus    9u   REG              147,0        0  3670172 /cluster/var/imap/socket/imap-0.lock
imapd   15139 cyrus   10u   REG              147,0       43  3686567 /cluster/var/imap/proc/15139
imapd   15139 cyrus   11u   REG              147,0  3792896  3670949 /cluster/var/imap/tls_sessions.db
imapd   15139 cyrus   12u  unix 0xffff8801dfb6f280      0t0 11021883 socket
imapd   15139 cyrus   13u   REG              147,0      215  3801660 /cluster/var/spool/imap/user/office/cyrus.header
imapd   15139 cyrus   14u   REG              147,0      976  3801204 /cluster/var/spool/imap/user/office/cyrus.index (deleted)
imapd   15139 cyrus   15u   REG              147,0    16736  3802117 /cluster/var/spool/imap/user/office/cyrus.cache (deleted)
imapd   15139 cyrus   16u   REG              147,0    57804  3686853 /cluster/var/imap/user/o/office.seen

So, fileid #16 points to /cluster/var/imap/user/o/office.seen

ravel:~ # ll /cluster/var/imap/user/o/office.seen
- -rw------- 1 cyrus mail 57804 27. Sep 14:11 /cluster/var/imap/user/o/office.seen


I checked about 20 processes and all of them hang on the F_SETLKW
fcntl() call on fileid #16 which always points to the same file
(/cluster/var/imap/user/o/office.seen)

/cluster is a ext4 filesystem on a DRBD device:

/dev/drbd0 on /cluster type ext4 (rw,relatime,data=ordered)

The DRBD device connects two identical machines running as
active/standby cluster (using pacemaker/corosync). The IMAP
server is configured as cluster resource using the LSB resource
script and was started on September 22nd and is running since then.

I am not aware of any change in software or configuration on
or around Septemer 26th, which might explain this phenomenon.

Users currently can still work with the system and do not see
and performance degradation (yet). Email delivery to the IMAP
server works, too (I'm using sendmail with LMTP)

I have worked with many cyrus-imapd installations in the past (clustered
and non-clustered) and have never seen such a phenomenon. Is this a bug
or a feature?

I don't think that this is a healty situation, I think something
is wrong here and should be corrected ASAP.

Has anyone seen something like this before?

Any idea on how to proceed with this?

Thanks!

- - andreas

- -- 
Andreas Haumer                     | mailto:andreas at xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iD8DBQFQbXPPxJmyeGcXPhERAjnrAKCHEsZ3LaeAND678pZ4nqiL4n2EkACgsl/f
R/RpbMXmi02UPZKOAGn+mws=
=diYl
-----END PGP SIGNATURE-----


More information about the Info-cyrus mailing list