Upgrade 2.1.12 - 2.2.10 weirdness (subscribe -> cpu bound)
Gilles Bruno
Gilles.Bruno at ujf-grenoble.fr
Fri Dec 3 14:00:16 EST 2004
Hi,
Sorry for this long mail... and for my poor English.
Since the announcement of CAN-2004-1015, we have been (slowly/cautiously) upgrading
our cyrus-imap servers: everything went fine when we upgraded cyrus 2.1.15
and 2.2.8 to 2.2.10, but we've got problems upgrading our main server,
currently running cyrus imapd 2.1.12.
** What we did :
This server runs Cyrus 2.1.12 + Sasl 2.1.12 + db3 and hosts ~ 3000 users and
140 GB (Raid 5) of e-mail (about 20,000,000 messages). We "prudently" decided to
migrate its content to another server running cyrus 2.2.10 - the "old" 2.1.12
server is *still* untouched.
We copied its mailspool, mailboxes.db (flat), user.seen & user.sub files
to a freshly installed FreeBSD 4 stable + db4 (db41-4.1.25) + sasl 2.1.18 +
cyrus 2.2.10 machine.
For its backend DBs we use the following settings:
annotation_db: skiplist (unused on 2.1.12)
duplicate_db: berkeley-nosync (DB3 on 2.1.12)
mboxlist_db: flat (unchanged)
ptscache_db: berkeley (unused on 2.1.12)
quota_db: flat (unused - no quotas)
seenstate_db: flat (unchanged)
subscription_db: flat (unchanged)
tlscache_db: berkeley-nosync (unused - no tls)
(we use the *same* DB backends on our old 2.1.12 server)
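Since the same binaries behave differently on two servers, one cheap check is to diff the `*_db` backend settings of the two imapd.conf files. The following is only a minimal sketch: it assumes the usual `option: value` layout of imapd.conf, and the second configuration in the usage example (with `mboxlist_db: skiplist`) is hypothetical, not taken from either of our servers.

```python
def parse_db_settings(text):
    """Collect the '*_db' options from imapd.conf-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip().endswith("_db"):
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Settings quoted in this mail for the new 2.2.10 server.
new_server = """
annotation_db: skiplist
duplicate_db: berkeley-nosync
mboxlist_db: flat
ptscache_db: berkeley
quota_db: flat
seenstate_db: flat
subscription_db: flat
tlscache_db: berkeley-nosync
"""

# Hypothetical configuration of another server, for illustration only.
other_server = new_server.replace("mboxlist_db: flat", "mboxlist_db: skiplist")

differences = diff_settings(parse_db_settings(new_server),
                            parse_db_settings(other_server))
print(differences)
```

An empty result would mean the backend settings are identical, so the difference has to come from somewhere else (data, OS, hardware).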
We reconstructed its mailspool twice (su cyrus -c /usr/cyrus/bin/reconstruct
-rf user) and then ran chk_cyrus: everything completed flawlessly.
** What then happened :
So far, we have had no visible problem - the users can happily use their
mailboxes, and neither seen states nor ACLs have been lost.
*But* when a single user wants to modify their subscriptions (using
mozilla/thunderbird: "Files" -> "Subscribe...") the imapd process takes
"ages" (~ 20s), and worse, this imapd eats ~ 80% CPU on a dual Xeon 2.8 / 1 GB !!
About 20s later (even if the user has only 20 mailboxes), it returns the right
list, but we're really frightened of what will happen with ~ 500 simultaneous
users :/
... Furthermore, we run exactly the same cyrus imapd binaries on the same
hardware (Dell PE2650) and OS (FreeBSD 4 stable) on another server (successfully
upgraded from 2.2.8 to 2.2.10) without any problem - the subscribe dialog
appears without delay or CPU "plateau".
We ktraced the imapd process on the 2 servers and found no differences (apart
from the delays).
** Our conclusions so far :
. it's not an I/O issue - no activity on the dedicated raid/AHC
39160/partition - iostat shows it almost idle
. neither ctl_cyrusdb -r nor chk_cyrus complains - no "suspicious" log entries
. it's not likely a user.sub DB problem - we tried converting user.sub to
skiplist, DB and even recreated them, without any success
---------------- sample dialog (mbox names obscured/removed) :
"27 lsub "" "INBOX.*"\r
(snip)
27 OK Completed (0.000 secs 26 calls)\r
"28 list "" "INBOX.%"\r
(snip)
28 OK Completed (0.008 secs 31 calls)\r
"29 list "" "INBOX.%.%"\r
(snip)
29 OK Completed (0.016 secs 18 calls)\r
"30 lsub "" "user.*"\r
(snip)
"30 OK Completed (0.000 secs 1 calls)\r
"31 list "" "user.%"\r
(snip)
"31 OK Completed (6.227 secs 1 calls)\r <- ###### THIS ONE
"32 list "" "user.%.%"\r
(snip)
"32 OK Completed (6.375 secs 1 calls)\r <- ###### THIS ONE
"
"33 lsub "" "*"\r
(snip)
33 OK Completed (0.008 secs 28 calls)\r
"34 list "" "%"\r
(snip)
"* LIST (\\HasChildren) "." "INBOX"\r
* LIST (\\HasChildren) "." "XXX"\r (shared mb)
34 OK Completed (6.305 secs 34 calls)\r <- ###### THIS ONE
"35 list "" "%.%"\r
"* LIST (\\HasChildren) "." "INBOX.XXX"\r
* LIST (\\HasNoChildren) "." "INBOX.YYY"\r
(snip)
* LIST (\\HasNoChildren) "." "crip-visio.gdfgdfg"\r (shared mb)
35 OK Completed (6.492 secs 32 calls)\r <- ###### THIS ONE
"
"36 IDLE\r
"36 OK Completed\r
"
"37 close\r
38 logout\r
----------------
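To pick out the slow commands from a longer trace like the one above, the "OK Completed (N secs M calls)" lines can be matched back to the command with the same tag. This is a rough sketch, assuming the trace keeps the layout shown here (optional leading quote, literal "\r" at the end of each line); the function name and the 1-second threshold are arbitrary choices.

```python
import re

# A tagged LIST/LSUB command, e.g.  "31 list "" "user.%"\r
COMMAND_RE = re.compile(r'^"?(?P<tag>\d+) (?P<cmd>(?:list|lsub) "" "[^"]*")')
# A tagged completion, e.g.  31 OK Completed (6.227 secs 1 calls)\r
COMPLETED_RE = re.compile(
    r'^"?(?P<tag>\d+) OK Completed \((?P<secs>\d+\.\d+) secs (?P<calls>\d+) calls\)'
)


def slow_commands(lines, threshold=1.0):
    """Return [(tag, command, seconds)] for completions slower than threshold."""
    commands = {}
    slow = []
    for raw in lines:
        line = raw.strip()
        m = COMMAND_RE.match(line)
        if m:
            commands[m.group("tag")] = m.group("cmd")
            continue
        m = COMPLETED_RE.match(line)
        if m and float(m.group("secs")) >= threshold:
            tag = m.group("tag")
            slow.append((tag, commands.get(tag, "?"), float(m.group("secs"))))
    return slow


# A few lines copied from the dialog above (raw strings keep the literal \r).
sample = [
    r'"28 list "" "INBOX.%"\r',
    r'28 OK Completed (0.008 secs 31 calls)\r',
    r'"30 lsub "" "user.*"\r',
    r'"30 OK Completed (0.000 secs 1 calls)\r',
    r'"31 list "" "user.%"\r',
    r'"31 OK Completed (6.227 secs 1 calls)\r <- ###### THIS ONE',
]

for tag, cmd, secs in slow_commands(sample):
    print(tag, cmd, secs)
```

On the excerpt, only the `list "" "user.%"` command crosses the threshold; completions without timing information (like the one for IDLE) are simply ignored.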
. the ktrace shows (of course) many calls to the mailboxes.db file -
apparently, the ( LIST "" "user.%" ) commands take ages to complete (we've
got 127000 lines in our mailboxes.db flat file) - but when we run a single
"listmailboxes %" or "listmailboxes %.%" under cyradm, it completes at
normal speed...
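To reproduce the timing outside of Thunderbird, the same LIST patterns can be issued from a small script and timed on the client side. The sketch below uses Python's standard imaplib; `time_list` and `time_against_server` are names made up for this example, and the host and credentials are whatever test account is available (none are given here). It only measures wall-clock time per command and does not explain where the time goes.

```python
import imaplib
import time


def time_list(conn, patterns, reference='""'):
    """Run LIST for each pattern; return [(pattern, seconds, n_results)].

    `conn` is anything with an imaplib-style list(directory, pattern) method.
    """
    results = []
    for pattern in patterns:
        start = time.perf_counter()
        typ, data = conn.list(reference, '"%s"' % pattern)
        elapsed = time.perf_counter() - start
        if typ != "OK":
            raise RuntimeError("LIST %s failed: %r" % (pattern, data))
        # imaplib returns [None] when nothing matched.
        count = sum(1 for item in data if item is not None)
        results.append((pattern, elapsed, count))
    return results


def time_against_server(host, user, password):
    """Log in to a real server and time the patterns from the dialog."""
    conn = imaplib.IMAP4(host)
    try:
        conn.login(user, password)
        return time_list(conn, ["INBOX.%", "user.%", "%", "%.%"])
    finally:
        conn.logout()
```

Running `time_against_server` once as an ordinary user and once as an admin user, on both servers, would show whether the delay depends on the account as well as on the pattern.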
=-=-=-=
Could any "gurus" out there enlighten us: we're running out of candles for
our voodoo cults... and of course (thanks, Mr. Murphy) we've got to migrate
quickly - our off-campus IMAP access has been blocked since Wed. 25/11
If it is mailboxes.db related (???), would a single reconstruct -rf from an
empty mbox.db file help (we already ran it twice - there were no diffs) ??
Thanks for your patience with this long mail,
best regards
Gilles BRUNO
System Admin
University Joseph Fourier - France
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
More information about the Info-cyrus
mailing list