Followup: Problem with Mulberry, cyrus-imapd 2.2.8 and DIGEST-MD5
(long!)
Sebastian Hagedorn
Hagedorn at uni-koeln.de
Tue Oct 19 08:35:12 EDT 2004
Hi,
I found some more info regarding this:
--On Dienstag, 19. Oktober 2004 8:39 Uhr +0200 Sebastian Hagedorn
<Hagedorn at uni-koeln.de> wrote:
> on Saturday we upgraded our Cyrus IMAP server from 2.1.16 to 2.2.8.
> Everything went smoothly, but since then some colleagues and I have
> noticed that at the very least there seems to be a new issue with
> Mulberry 3.1.6 and DIGEST-MD5 authentication. Every once in a while an
> authentication attempt will fail with the following message (a screenshot
> is attached):
>
> Could not login to server: <39> DIGEST-MD5 server step 1+....
>
> If you try again a bit later, the authentication succeeds. This never
> happened with the older version of cyrus-imapd. I remember that
> previously there were other issues with Mulberry and DIGEST-MD5 (e.g. for
> managesieve authentication), but this seems to be something new.
Meanwhile I've found that other clients have problems as well and the
mechanism doesn't seem to matter. What's going on seems to be this:
for some reason imapd process (possibly others as well) go into a weird
state where they don't work properly anymore:
[root at lvr13 sasltmp]# zgrep 16751 /var/log/imapd.log.1.gz
Oct 18 09:38:17 lvr13 master[16751]: about to exec
/usr/lib/cyrus-imapd/imapd
Oct 18 09:38:17 lvr13 imap[16751]: executed
Oct 18 09:39:00 lvr13 imap[16751]: accepted connection
Oct 18 09:39:00 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de
[134.95.19.104] gotzhein CRAM-MD5 User logged in
Oct 18 09:39:00 lvr13 imap[16751]: seen_db: user gotzhein opened
/var/lib/imap/user/Q/gotzhein.seen
Oct 18 09:39:00 lvr13 imap[16751]: open: user gotzhein opened INBOX
Oct 18 09:39:23 lvr13 imap[16751]: accepted connection
Oct 18 09:39:23 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de
[134.95.19.104] schindlm CRAM-MD5 User logged in
Oct 18 09:39:23 lvr13 imap[16751]: seen_db: user schindlm opened
/var/lib/imap/user/U/schindlm.seen
Oct 18 09:39:23 lvr13 imap[16751]: open: user schindlm opened INBOX
Oct 18 09:39:23 lvr13 imap[16751]: SQUAT returned 270 messages
Oct 18 09:40:01 lvr13 imap[16751]: accepted connection
Oct 18 09:40:07 lvr13 imap[16751]: login: lvr6.rrz.uni-koeln.de
[134.95.19.104] gotzhein CRAM-MD5 User logged in
Oct 18 09:42:29 lvr13 imap[16751]: accepted connection
Oct 18 09:43:36 lvr13 imap[16751]: accepted connection
Oct 18 09:46:13 lvr13 imap[16751]: accepted connection
Oct 18 09:51:40 lvr13 imap[16751]: accepted connection
Oct 18 09:59:04 lvr13 imap[16751]: accepted connection
Oct 18 10:04:53 lvr13 imap[16751]: accepted connection
...
This goes on even now. So the imapd with PID 16751 still *answers*
connections, but they all fail somehow. It's pure chance if a given
incoming connection is mapped to such a "broken" process. I've tried to
find out some more (lsof output is edited for clarity).
lsof -p 16751
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
imapd 16751 cyrus cwd DIR 254,33 4096 29261830
/var/spool/imap/U/user/schindlm
imapd 16751 cyrus rtd DIR 8,1 4096 2 /
imapd 16751 cyrus txt REG 8,1 966936 150307
/usr/lib/cyrus-imapd/imapd
imapd 16751 cyrus mem REG 254,25 144 65
/var/lib/imap/annotations.db
imapd 16751 cyrus mem REG 254,25 5198220 31
/var/lib/imap/mailboxes.db
iimapd 16751 cyrus 0u IPv4 274515268 TCP
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd 16751 cyrus 1u IPv4 274515268 TCP
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd 16751 cyrus 2u IPv4 274515268 TCP
cyrus.rrz.uni-koeln.de:imap->lvr6.rrz.uni-koeln.de:58346 (ESTABLISHED)
imapd 16751 cyrus 3w FIFO 0,5 43136 pipe
imapd 16751 cyrus 4u IPv4 43135 TCP
cyrus.rrz.uni-koeln.de:imap (LISTEN)
imapd 16751 cyrus 5u REG 254,25 46 195975
/var/lib/imap/proc/16751
imapd 16751 cyrus 6u REG 254,25 5198220 31
/var/lib/imap/mailboxes.db
imapd 16751 cyrus 7u REG 254,25 144 65
/var/lib/imap/annotations.db
imapd 16751 cyrus 8u REG 254,25 0 2072662
/var/lib/imap/socket/imap-0.lock
imapd 16751 cyrus 9u REG 254,25 716800 63
/var/lib/imap/tls_sessions.db
imapd 16751 cyrus 10u REG 254,25 10485700 97948
/var/lib/imap/db/log.0000000825 (deleted)
imapd 16751 cyrus 13u REG 254,25 6880 2431892
/var/lib/imap/user/U/schindlm.seen
So the 'seen' file of a previous session is still open. It does not seem to
be involved in a deadlock: 'grep 2431892 /proc/locks' doesn't show
anything. I'm not sure how relevant that may be. The TCP connections vary
and obviously *some* kind of exchange takes place with the hapless clients
that connect to this imapd.
[root at lvr13 sasltmp]# strace -t -p 16751
Process 16751 attached - interrupt to quit
14:14:51 select(1, [0], NULL, NULL, {1688, 630000}) = 1 (in [0], left
{1491, 920000})
14:18:08 time(NULL) = 1098188288
14:18:08 read(0, "", 4096) = 0
14:18:08 close(5) = 0
14:18:08 munmap(0xb49ec000, 4096) = 0
14:18:08 unlink("/var/lib/imap/proc/16751") = 0
14:18:08 shutdown(0, 0 /* receive */) = 0
14:18:08 close(0) = 0
14:18:08 shutdown(1, 0 /* receive */) = 0
14:18:08 close(1) = 0
14:18:08 shutdown(2, 0 /* receive */) = 0
14:18:08 close(2) = 0
14:18:08 getpid() = 16751
14:18:08 write(3, "\1\0\0\0oA\0\0", 8) = 8
14:18:08 rt_sigaction(SIGALRM, {0x804ccd0, [], SA_RESTORER|SA_ONESHOT,
0xb729ceb8}, NULL, 8) = 0
14:18:08 alarm(60) = 0
14:18:08 fcntl64(8, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0,
len=0}) = 0
So after a timeout everything is closed and the process waits for the next
connection:
14:19:43 stat64("/usr/lib/cyrus-imapd/imapd", {st_mode=S_IFREG|0755,
st_size=966936, ...}) = 0
14:19:43 accept(4, 0, NULL) = 0
14:19:48 fcntl64(8, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0,
len=0}) = 0
14:19:48 alarm(0) = 0
14:19:48 getpid() = 16751
14:19:48 getpeername(0, {sa_family=AF_INET, sin_port=htons(60136),
sin_addr=inet_addr("134.95.19.104")}, [16]) = 0
14:19:48 getpeername(0, {sa_family=AF_INET, sin_port=htons(60136),
sin_addr=inet_addr("134.95.19.104")}, [16]) = 0
14:19:48 getsockname(0, {sa_family=AF_INET, sin_port=htons(143),
sin_addr=inet_addr("134.95.19.46")}, [16]) = 0
...
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "* OK cyrus.rrz.uni-koeln.de Cyru"..., 80) = 80
14:19:48 time(NULL) = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL) = 1098188388
14:19:48 read(0, "00000000 CAPABILITY\r\n", 4096) = 21
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "* CAPABILITY IMAP4 IMAP4rev1 ACL"..., 334) = 334
14:19:48 time(NULL) = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL) = 1098188388
14:19:48 read(0, "00000001 AUTHENTICATE CRAM-MD5\r\n", 4096) = 32
14:19:48 time(NULL) = 1098188388
14:19:48 open("/dev/urandom", O_RDONLY) = 11
14:19:48 read(11, "C\333e\357\365\206", 6) = 6
14:19:48 close(11) = 0
14:19:48 gettimeofday({1098188388, 922756}, NULL) = 0
14:19:48 times({tms_utime=231, tms_stime=846, tms_cutime=0, tms_cstime=0})
= 17965478
14:19:48 select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
14:19:48 write(1, "+ PDE0OTcyNzIwOTEuNzY2OTQxM0BjeX"..., 64) = 64
14:19:48 time(NULL) = 1098188388
14:19:48 select(1, [0], NULL, NULL, {1800, 0}) = 1 (in [0], left {1800, 0})
14:19:48 time(NULL) = 1098188388
14:19:48 read(0, "aGVpbmVuaiA3MzFlMzk3OGJmM2U0YjQ2"..., 4096) = 58
...
14:19:50 write(5, "lvr6.rrz.uni-koeln.de [134.95.19"..., 46) = 46
14:19:50 _llseek(5, 0, [46], SEEK_CUR) = 0
14:19:50 ftruncate(5, 46) = 0
14:19:50 time([1098188390]) = 1098188390
14:19:50 getpid() = 16751
14:19:50 rt_sigaction(SIGPIPE, {0xb734d8e0, [], SA_RESTORER, 0xb729ceb8},
{SIG_IGN}, 8) = 0
14:19:50 send(1, "<181>Oct 19 14:19:50 imap[16751]"..., 110, 0) = 110
14:19:50 rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
14:19:50 write(1, "00000001 OK Success (no protecti"..., 37) = 37
14:19:50 getpid() = 16751
14:19:50 open("/var/lib/imap/log/heinenj/16751", O_WRONLY|O_APPEND|O_CREAT,
0644) = -1 ENOENT (No such file or directory)
14:19:50 open("/var/lib/imap/msg/shutdown", O_RDONLY) = -1 ENOENT (No such
file or directory)
14:19:50 time(NULL) = 1098188390
14:19:50 select(1, [0], NULL, NULL, {1800, 0} <unfinished ...>
At this point it *looks* as though the authentication went fine, but
nothing further happens until the next timeout. I have no idea what to make
of this. I'd be grateful for any pointers.
Greetings, Sebastian Hagedorn
--
Sebastian Hagedorn M.A. - RZKR-R1 (Gebäude 52), Zimmer 18
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
Universität zu Köln / Cologne University - Tel. +49-221-478-5587
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 194 bytes
Desc: not available
Url : https://lists.andrew.cmu.edu/mailman/private/info-cyrus/attachments/20041019/903bd89c/attachment.bin
More information about the Info-cyrus
mailing list