Cyrus stops responding to imap:143 requests

Eric Luyten Eric.Luyten at vub.ac.be
Wed Oct 27 09:03:18 EDT 2010


On Fri, October 15, 2010 11:41 am, Rodrigo Ventura wrote:
>

> Hello all,
>
>
> for some time now we have been bugged with this problem: in our setup we have
> cyrus-imapd (on HOST1) listening to both imaps:993 and imap:143 ports, the
> former for the users, and the latter for a "saslauthd -a rimap" running on
> another host (on HOST2; for SMTP AUTH purposes). While the imaps:993 has been
> working flawlessly, from time to time imap:143 stops working. The socket is
> still there, LISTENing, but when telnetting to it, it does not respond:
>
> HOST2$ telnet HOST1 imap
> Trying HOST1...
> Connected to HOST1.
> Escape character is '^]'.
>
>
> The connection is established:
>
>
> HOST2$ netstat -ntpe | grep 84545334
> (Not all processes could be identified, non-owned process info
> will not be shown, you would have to be root to see it all.)
> tcp        0      0 HOST2:49774     HOST1:143       ESTABLISHED 1000       84545334   2310/telnet
>
>
> but on the other side:
>
> HOST1# netstat -ntpae|grep :49774
> tcp        0      0 HOST1:143       HOST2:49774     SYN_RECV    0          0          -
>
>
> Now, on the HOST1 side, "master" preforks several imapd -s and imapd
> processes, but while there is an imapd -s listening on 993:
>
> # netstat -ntpa|grep 0.0.0.0:993
> tcp        0      0 0.0.0.0:993             0.0.0.0:*               LISTEN      725/imapd
>
>
> it is master who is listening to 143:
>
> # netstat -ntpa|grep 0.0.0.0:143
> tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      28090/master
>
>
> But what is really strange is that master does not seem to include the
> LISTENing port in its select() call:
>
>
> # netstat -ntpae | grep master
> tcp        0      0 0.0.0.0:110             0.0.0.0:*               LISTEN      0          24593785   28090/master
> tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      0         *24593773*  28090/master
> tcp        0      0 0.0.0.0:2003            0.0.0.0:*               LISTEN      0          24593803   28090/master
> tcp        0      0 :::993                  :::*                    LISTEN      0          24593777   28090/master
> tcp        0      0 :::995                  :::*                    LISTEN      0          24593789   28090/master
> tcp        0      0 :::110                  :::*                    LISTEN      0          24593783   28090/master
> tcp        0      0 :::143                  :::*                    LISTEN      0          24593771   28090/master
> tcp        0      0 :::2000                 :::*                    LISTEN      0          24593795   28090/master
> tcp        0      0 :::2003                 :::*                    LISTEN      0          24593801   28090/master
>
> # ls -laF /proc/28090/fd|grep socket
> lrwx------ 1 root  root 64 Oct 15 09:46 *10* -> socket:[24593773] <<<
> lrwx------ 1 root  root 64 Oct 15 09:46 13 -> socket:[24593777]
> lrwx------ 1 root  root 64 Oct 15 09:46 16 -> socket:[24593779]
> lrwx------ 1 root  root 64 Oct 15 09:46 19 -> socket:[24593783]
> lrwx------ 1 root  root 64 Oct 15 09:46 22 -> socket:[24593785]
> lrwx------ 1 root  root 64 Oct 15 09:46 25 -> socket:[24593789]
> lrwx------ 1 root  root 64 Oct 15 09:46 28 -> socket:[24593791]
> lrwx------ 1 root  root 64 Oct 15 09:46 31 -> socket:[24593795]
> lrwx------ 1 root  root 64 Oct 15 09:46 34 -> socket:[24593797]
> lrwx------ 1 root  root 64 Oct 15 09:46 37 -> socket:[24593801]
> lrwx------ 1 root  root 64 Oct 15 09:46 40 -> socket:[24593803]
> lrwx------ 1 root  root 64 Oct 15 09:46 43 -> socket:[24593805]
> lrwx------ 1 root  root 64 Oct 15 09:46 46 -> socket:[24593808]
> lrwx------ 1 root  root 64 Oct 15 09:46 5 -> socket:[24593753]
> lrwx------ 1 root  root 64 Oct 15 09:46 7 -> socket:[24593771]
>
> # strace -p 28090
> Process 28090 attached - interrupt to quit
> select(48, [8 11 14 17 20 23 26 29 32 35 38 41 44 47], NULL, NULL, {3, 544000}) = 1 (in [47], left {2, 300000})
> read(47, "\1\0\0\0\2330\0\0", 8)        = 8 [...]
>
>
> The dirty way of solving this is to kill and restart master again, but in the
> meantime our users are no longer able to use SMTP AUTH from HOST2...
>
> Any clues of what is going on here?
>
>
> Cheers,
>
>
> Rodrigo Ventura
> ISR / IST
>
>
> PS: after restarting master, it is still not including 143 on its select(),
> but someone is responding to 143:
>
> HOST2$ telnet HOST1 143
> Trying HOST1...
> Connected to HOST1.
> Escape character is '^]'.
> * OK [CAPABILITY IMAP4 IMAP4rev1 LITERAL+ ID STARTTLS AUTH=PLAIN AUTH=OTP
> AUTH=DIGEST-MD5 AUTH=CRAM-MD5 SASL-IR COMPRESS=DEFLATE] HOST1 Cyrus IMAP
> v2.3.16 server ready
>
> HOST2$ netstat -ntpa|grep 143
> (Not all processes could be identified, non-owned process info
> will not be shown, you would have to be root to see it all.)
> tcp        0      0 HOST2:33584     HOST1:143       ESTABLISHED 3398/telnet
>
> HOST1# netstat -ntpae|grep :33584
> tcp        0      0 HOST1:143       HOST2:33584     ESTABLISHED 96         27619330   18847/imapd
>


Rodrigo,


What does your /etc/cyrus.conf look like?
In particular, pay attention to the way you make the Cyrus master
distinguish between different service names.

On 2.3.15, we ran into a very similar situation: I had distinguished
two services by giving one of them a suffix composed of a hyphen and
some text, but Cyrus apparently only used the part before the hyphen
in its internal housekeeping.
Half of the time one service was responding, half of the time the
other one. Very annoying.

To make things clearer :

   problem situation :

        popserv       cmd="pop3d -C ..."
        popserv-sec   cmd="pop3d -s -C ..."

   problemless situation :

        popserv       cmd="pop3d -C ..."
        popservsec    cmd="pop3d -s -C ..."

Finding the solution was helped by us noticing that syslogging was
also only done using the first part of the service name (popserv).
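
If you want to check a configuration for this pattern, the following is a
minimal Python sketch, not anything shipped with Cyrus. It assumes that
"SERVICES {" appears on one line, that each entry sits on its own line with
the service name as the first word, and that a collision means two names
sharing the same text before the first hyphen (which is only what we
observed on 2.3.15). The function names and the "imap" entry in the sample
are made up for illustration.

```python
import re
from collections import defaultdict


def service_names(conf_text):
    """Return the entry names found inside the SERVICES { ... } block."""
    names = []
    in_services = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if not in_services:
            # Assumption: "SERVICES {" is written on a single line.
            if re.match(r"SERVICES\s*\{", line):
                in_services = True
            continue
        if line.startswith("}"):
            break
        names.append(line.split()[0])  # first word is the service name
    return names


def hyphen_collisions(names):
    """Group names that share the same text before the first hyphen."""
    groups = defaultdict(list)
    for name in names:
        groups[name.split("-", 1)[0]].append(name)
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}


# Sample based on the problem situation above; "imap" is a hypothetical
# extra entry added only to show that unrelated names are not flagged.
SAMPLE = '''
SERVICES {
    popserv       cmd="pop3d -C ..."
    popserv-sec   cmd="pop3d -s -C ..."
    imap          cmd="imapd"
}
'''

if __name__ == "__main__":
    for prefix, group in hyphen_collisions(service_names(SAMPLE)).items():
        print(f"{prefix}: {', '.join(group)}")
```

To run it against a real file, read /etc/cyrus.conf into a string and pass
that to service_names() instead of SAMPLE. Any group it reports is only a
candidate for renaming, not proof of the problem.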



Hope this helps,
Eric Luyten, Computing Centre VUB/ULB.



