Cyrus stops responding to imap:143 requests

Rodrigo Ventura yoda at isr.ist.utl.pt
Fri Oct 15 05:41:53 EDT 2010


Hello all,

for some time now we have been bugged by this problem. In our setup, cyrus-imapd (on HOST1) listens on both the imaps:993 and imap:143 ports: the former serves the users, and the latter serves a "saslauthd -a rimap" running on another host (HOST2, for SMTP AUTH purposes). While imaps:993 has been working flawlessly, from time to time imap:143 stops responding. The socket is still there, LISTENing, but a telnet to it gets no response:

HOST2$ telnet HOST1 imap
Trying HOST1...
Connected to HOST1.
Escape character is '^]'.

The connection is established:

HOST2$ netstat -ntpe | grep 84545334
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 HOST2:49774     HOST1:143       ESTABLISHED 1000       84545334   2310/telnet

but on the other side:

HOST1# netstat -ntpae|grep :49774
tcp        0      0 HOST1:143       HOST2:49774     SYN_RECV    0          0          -
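
The symptom above (the TCP connect succeeds from the client side, but no IMAP greeting ever arrives) can be checked automatically instead of by hand with telnet. What follows is only a minimal sketch in Python, not something from our setup: the function name probe_imap_greeting and the 5 second default timeout are assumptions chosen for illustration.

```python
import socket

def probe_imap_greeting(host, port=143, timeout=5.0):
    """Connect to host:port and return the first line the server sends.

    Returns None when the connection is established but no complete
    greeting line arrives within `timeout` seconds (the hang described
    above). Connection failures (refused, unreachable, connect timeout)
    are raised as OSError so they are not confused with the hang.
    """
    # create_connection() also applies `timeout` to later recv() calls.
    sock = socket.create_connection((host, port), timeout=timeout)
    with sock:
        data = b""
        try:
            while b"\n" not in data:
                chunk = sock.recv(4096)
                if not chunk:        # peer closed without finishing a line
                    break
                data += chunk
        except socket.timeout:
            return None              # connected, but the server stayed silent
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", "replace") if line else None
```

Run from HOST2 against HOST1, a healthy server should give back the "* OK ..." banner line, while the broken state should give None.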

Now, on the HOST1 side, "master" preforks several imapd -s and imapd processes, but while there is an imapd -s listening on 993:

# netstat -ntpa|grep 0.0.0.0:993
tcp        0      0 0.0.0.0:993             0.0.0.0:*               LISTEN      725/imapd

it is master that is listening on 143:

# netstat -ntpa|grep 0.0.0.0:143
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      28090/master

But what is really strange is that master does not seem to include the LISTENing socket for 143 (inode 24593773, which is fd 10 of master) in the read set of its select() call:

# netstat -ntpae | grep master
tcp        0      0 0.0.0.0:110             0.0.0.0:*               LISTEN      0          24593785   28090/master
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      0          24593773   28090/master <<<
tcp        0      0 0.0.0.0:2003            0.0.0.0:*               LISTEN      0          24593803   28090/master
tcp        0      0 :::993                  :::*                    LISTEN      0          24593777   28090/master
tcp        0      0 :::995                  :::*                    LISTEN      0          24593789   28090/master
tcp        0      0 :::110                  :::*                    LISTEN      0          24593783   28090/master
tcp        0      0 :::143                  :::*                    LISTEN      0          24593771   28090/master
tcp        0      0 :::2000                 :::*                    LISTEN      0          24593795   28090/master
tcp        0      0 :::2003                 :::*                    LISTEN      0          24593801   28090/master

# ls -laF /proc/28090/fd|grep socket
lrwx------ 1 root  root 64 Oct 15 09:46 10 -> socket:[24593773] <<<
lrwx------ 1 root  root 64 Oct 15 09:46 13 -> socket:[24593777]
lrwx------ 1 root  root 64 Oct 15 09:46 16 -> socket:[24593779]
lrwx------ 1 root  root 64 Oct 15 09:46 19 -> socket:[24593783]
lrwx------ 1 root  root 64 Oct 15 09:46 22 -> socket:[24593785]
lrwx------ 1 root  root 64 Oct 15 09:46 25 -> socket:[24593789]
lrwx------ 1 root  root 64 Oct 15 09:46 28 -> socket:[24593791]
lrwx------ 1 root  root 64 Oct 15 09:46 31 -> socket:[24593795]
lrwx------ 1 root  root 64 Oct 15 09:46 34 -> socket:[24593797]
lrwx------ 1 root  root 64 Oct 15 09:46 37 -> socket:[24593801]
lrwx------ 1 root  root 64 Oct 15 09:46 40 -> socket:[24593803]
lrwx------ 1 root  root 64 Oct 15 09:46 43 -> socket:[24593805]
lrwx------ 1 root  root 64 Oct 15 09:46 46 -> socket:[24593808]
lrwx------ 1 root  root 64 Oct 15 09:46 5 -> socket:[24593753]
lrwx------ 1 root  root 64 Oct 15 09:46 7 -> socket:[24593771]

# strace -p 28090
Process 28090 attached - interrupt to quit
select(48, [8 11 14 17 20 23 26 29 32 35 38 41 44 47], NULL, NULL, {3, 544000}) = 1 (in [47], left {2, 300000})
read(47, "\1\0\0\0\2330\0\0", 8)        = 8
[...]
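
The comparison done by eye above (netstat inode, then the matching fd under /proc, then the select() read set) can be scripted. The following is a rough sketch that works on text captured from "ls -l /proc/PID/fd" and from one strace select() line; the function names are made up for illustration and the regular expressions assume the output formats shown in this message.

```python
import re

def socket_fds(ls_output):
    """Map fd number -> socket inode from `ls -l /proc/PID/fd` text."""
    fds = {}
    for line in ls_output.splitlines():
        # The fd is the number just before "->"; stray "*" markers
        # or missing spaces around it are tolerated.
        m = re.search(r"(\d+)[\s*]*->\s*socket:\[(\d+)\]", line)
        if m:
            fds[int(m.group(1))] = int(m.group(2))
    return fds

def select_read_set(strace_line):
    """Return the set of fds in the read set of a strace'd select() call."""
    m = re.search(r"select\(\d+,\s*\[([\d\s]*)\]", strace_line)
    return {int(x) for x in m.group(1).split()} if m else set()

def fds_not_selected(ls_output, strace_line, inode):
    """List the fds that refer to `inode` but are absent from the read set."""
    selected = select_read_set(strace_line)
    return [fd for fd, ino in sorted(socket_fds(ls_output).items())
            if ino == inode and fd not in selected]
```

Fed with the /proc listing and the select() line shown above, fds_not_selected(..., 24593773) returns [10], that is, the fd of the 0.0.0.0:143 socket is not being watched by master.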

The dirty way of solving this is to kill and restart master again, but in the meantime our users are no longer able to use SMTP AUTH from HOST2...

Any clues as to what is going on here?

Cheers,

Rodrigo Ventura
ISR / IST

PS: after restarting master, it still does not include 143 in its select(), but some process is now responding on 143:

HOST2$ telnet HOST1 143
Trying HOST1...
Connected to HOST1.
Escape character is '^]'.
* OK [CAPABILITY IMAP4 IMAP4rev1 LITERAL+ ID STARTTLS AUTH=PLAIN AUTH=OTP AUTH=DIGEST-MD5 AUTH=CRAM-MD5 SASL-IR COMPRESS=DEFLATE] HOST1 Cyrus IMAP v2.3.16 server ready

HOST2$ netstat -ntpa|grep 143
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 HOST2:33584     HOST1:143       ESTABLISHED 3398/telnet

HOST1# netstat -ntpae|grep :33584
tcp        0      0 HOST1:143       HOST2:33584     ESTABLISHED 96         27619330   18847/imapd 




More information about the Info-cyrus mailing list