Followup: Problem with Mulberry, cyrus-imapd 2.2.8 and DIGEST-MD5 (long!)

Sebastian Hagedorn Hagedorn at uni-koeln.de
Tue Oct 19 08:35:12 EDT 2004


Hi,

I found some more info regarding this:

--On Dienstag, 19. Oktober 2004 8:39 Uhr +0200 Sebastian Hagedorn 
<Hagedorn at uni-koeln.de> wrote:

> on Saturday we upgraded our Cyrus IMAP server from 2.1.16 to 2.2.8.
> Everything went smoothly, but since then some colleagues and I have
> noticed that at the very least there seems to be a new issue with
> Mulberry 3.1.6 and DIGEST-MD5 authentication. Every once in a while an
> authentication attempt will fail with the following message (a screenshot
> is attached):
>
> Could not login to server: <39> DIGEST-MD5 server step 1+....
>
> If you try again a bit later, the authentication succeeds. This never
> happened with the older version of cyrus-imapd. I remember that
> previously there were other issues with Mulberry and DIGEST-MD5 (e.g. for
> managesieve authentication), but this seems to be something new.

Meanwhile I've found that other clients have problems as well and the 
mechanism doesn't seem to matter. What's going on seems to be this:
for some reason imapd process (possibly others as well) go into a weird 
state where they don't work properly anymore:

[root at lvr13 sasltmp]# zgrep 16751 /var/log/imapd.log.1.gz
Oct 18 09:38:17 lvr13 master[16751]: about to exec 
/usr/lib/cyrus-imapd/imapd
Oct 18 09:38:17 lvr13 imap[16751]: executed
Oct 18 09:39:00 lvr13 imap[16751]: accepted connection
Oct 18 09:39:00 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de 
[134.95.19.104] gotzhein CRAM-MD5 User logged in
Oct 18 09:39:00 lvr13 imap[16751]: seen_db: user gotzhein opened 
/var/lib/imap/user/Q/gotzhein.seen
Oct 18 09:39:00 lvr13 imap[16751]: open: user gotzhein opened INBOX
Oct 18 09:39:23 lvr13 imap[16751]: accepted connection
Oct 18 09:39:23 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de 
[134.95.19.104] schindlm CRAM-MD5 User logged in
Oct 18 09:39:23 lvr13 imap[16751]: seen_db: user schindlm opened 
/var/lib/imap/user/U/schindlm.seen
Oct 18 09:39:23 lvr13 imap[16751]: open: user schindlm opened INBOX
Oct 18 09:39:23 lvr13 imap[16751]: SQUAT returned 270 messages
Oct 18 09:40:01 lvr13 imap[16751]: accepted connection
Oct 18 09:40:07 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de 
[134.95.19.104] gotzhein CRAM-MD5 User logged in
Oct 18 09:42:29 lvr13 imap[16751]: accepted connection
Oct 18 09:43:36 lvr13 imap[16751]: accepted connection
Oct 18 09:46:13 lvr13 imap[16751]: accepted connection
Oct 18 09:51:40 lvr13 imap[16751]: accepted connection
Oct 18 09:59:04 lvr13 imap[16751]: accepted connection
Oct 18 10:04:53 lvr13 imap[16751]: accepted connection
...

This goes on even now. So the imapd with PID 16751 still *answers* 
connections, but they all fail somehow. It's pure chance if a given 
incoming connection is mapped to such a "broken" process. I've tried to 
find out some more (lsof output is edited for clarity).

lsof -p 16751
COMMAND   PID  USER   FD   TYPE    DEVICE     SIZE     NODE NAME
imapd   16751 cyrus  cwd    DIR    254,33     4096 29261830 
/var/spool/imap/U/user/schindlm
imapd   16751 cyrus  rtd    DIR       8,1     4096        2 /
imapd   16751 cyrus  txt    REG       8,1   966936   150307 
/usr/lib/cyrus-imapd/imapd
imapd   16751 cyrus  mem    REG    254,25      144       65 
/var/lib/imap/annotations.db
imapd   16751 cyrus  mem    REG    254,25  5198220       31 
/var/lib/imap/mailboxes.db
iimapd   16751 cyrus    0u  IPv4 274515268               TCP 
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd   16751 cyrus    1u  IPv4 274515268               TCP 
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd   16751 cyrus    2u  IPv4 274515268               TCP 
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd   16751 cyrus    3w  FIFO       0,5             43136 pipe
imapd   16751 cyrus    4u  IPv4     43135               TCP 
cyrus.rrz.uni-koeln.de:imap (LISTEN)
imapd   16751 cyrus    5u   REG    254,25       46   195975 
/var/lib/imap/proc/16751
imapd   16751 cyrus    6u   REG    254,25  5198220       31 
/var/lib/imap/mailboxes.db
imapd   16751 cyrus    7u   REG    254,25      144       65 
/var/lib/imap/annotations.db
imapd   16751 cyrus    8u   REG    254,25        0  2072662 
/var/lib/imap/socket/imap-0.lock
imapd   16751 cyrus    9u   REG    254,25   716800       63 
/var/lib/imap/tls_sessions.db
imapd   16751 cyrus   10u   REG    254,25 10485700    97948 
/var/lib/imap/db/log.0000000825 (deleted)
imapd   16751 cyrus   13u   REG    254,25     6880  2431892 
/var/lib/imap/user/U/schindlm.seen

So the 'seen' file of a previous session is still open. It does not seem to 
be involved in a deadlock: 'grep 2431892 /proc/locks' doesn't show 
anything. I'm not sure how relevant that may be. The TCP connections vary 
and obviously *some* kind of exchange takes place with the hapless clients 
that connect to this imapd.

[root at lvr13 sasltmp]# strace -t -p 16751
Process 16751 attached - interrupt to quit
14:14:51 select(1, [0], NULL, NULL, {1688, 630000}) = 1 (in [0], left 
{1491, 920000})
14:18:08 time(NULL)                     = 1098188288
14:18:08 read(0, "", 4096)              = 0
14:18:08 close(5)                       = 0
14:18:08 munmap(0xb49ec000, 4096)       = 0
14:18:08 unlink("/var/lib/imap/proc/16751") = 0
14:18:08 shutdown(0, 0 /* receive */)   = 0
14:18:08 close(0)                       = 0
14:18:08 shutdown(1, 0 /* receive */)   = 0
14:18:08 close(1)                       = 0
14:18:08 shutdown(2, 0 /* receive */)   = 0
14:18:08 close(2)                       = 0
14:18:08 getpid()                       = 16751
14:18:08 write(3, "\1\0\0\0oA\0\0", 8)  = 8
14:18:08 rt_sigaction(SIGALRM, {0x804ccd0, [], SA_RESTORER|SA_ONESHOT, 
0xb729ceb8}, NULL, 8) = 0
14:18:08 alarm(60)                      = 0
14:18:08 fcntl64(8, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, 
len=0}) = 0

So after a timeout everything is closed and the process waits for the next 
connection:

14:19:43 stat64("/usr/lib/cyrus-imapd/imapd", {st_mode=S_IFREG|0755, 
st_size=966936, ...}) = 0
14:19:43 accept(4, 0, NULL)             = 0
14:19:48 fcntl64(8, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, 
len=0}) = 0
14:19:48 alarm(0)                       = 0
14:19:48 getpid()                       = 16751
14:19:48 getpeername(0, {sa_family=AF_INET, sin_port=htons(60136), 
sin_addr=inet_addr("134.95.19.104")}, [16]) = 0
14:19:48 getpeername(0, {sa_family=AF_INET, sin_port=htons(60136), 
sin_addr=inet_addr("134.95.19.104")}, [16]) = 0
14:19:48 getsockname(0, {sa_family=AF_INET, sin_port=htons(143), 
sin_addr=inet_addr("134.95.19.46")}, [16]) = 0
...
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "* OK cyrus.rrz.uni-koeln.de Cyru"..., 80) = 80
14:19:48 time(NULL)                     = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL)                     = 1098188388
14:19:48 read(0, "00000000 CAPABILITY\r\n", 4096) = 21
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "* CAPABILITY IMAP4 IMAP4rev1 ACL"..., 334) = 334
14:19:48 time(NULL)                     = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL)                     = 1098188388
14:19:48 read(0, "00000001 AUTHENTICATE CRAM-MD5\r\n", 4096) = 32
14:19:48 time(NULL)                     = 1098188388
14:19:48 open("/dev/urandom", O_RDONLY) = 11
14:19:48 read(11, "C\333e\357\365\206", 6) = 6
14:19:48 close(11)                      = 0
14:19:48 gettimeofday({1098188388, 922756}, NULL) = 0
14:19:48 times({tms_utime=231, tms_stime=846, tms_cutime=0, tms_cstime=0}) 
= 17965478
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "+ PDE0OTcyNzIwOTEuNzY2OTQxM0BjeX"..., 64) = 64
14:19:48 time(NULL)                     = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL)                     = 1098188388
14:19:48 read(0, "aGVpbmVuaiA3MzFlMzk3OGJmM2U0YjQ2"..., 4096) = 58
...
14:19:50 write(5, "lvr6.rrz.uni-koeln.de [134.95.19"..., 46) = 46
14:19:50 _llseek(5, 0, [46], SEEK_CUR)  = 0
14:19:50 ftruncate(5, 46)               = 0
14:19:50 time([1098188390])             = 1098188390
14:19:50 getpid()                       = 16751
14:19:50 rt_sigaction(SIGPIPE, {0xb734d8e0, [], SA_RESTORER, 0xb729ceb8}, 
{SIG_IGN}, 8) = 0
14:19:50 send(1, "<181>Oct 19 14:19:50 imap[16751]"..., 110, 0) = 110
14:19:50 rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
14:19:50 write(1, "00000001 OK Success (no protecti"..., 37) = 37
14:19:50 getpid()                       = 16751
14:19:50 open("/var/lib/imap/log/heinenj/16751", O_WRONLY|O_APPEND|O_CREAT, 
0644) = -1 ENOENT (No such file or directory)
14:19:50 open("/var/lib/imap/msg/shutdown", O_RDONLY) = -1 ENOENT (No such 
file or directory)
14:19:50 time(NULL)                     = 1098188390
14:19:50 select(1, [0], NULL, NULL, {1800, 0} <unfinished ...>

At this point it *looks* as though the authentication went fine, but 
nothing further happens until the next timeout. I have no idea what to make 
of this. I'd be grateful for any pointers.

Greetings, Sebastian Hagedorn
--
Sebastian Hagedorn M.A. - RZKR-R1 (Gebäude 52), Zimmer 18
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
Universität zu Köln / Cologne University - Tel. +49-221-478-5587
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 194 bytes
Desc: not available
Url : https://lists.andrew.cmu.edu/mailman/private/info-cyrus/attachments/20041019/903bd89c/attachment.bin


More information about the Info-cyrus mailing list