Cyrus 2.3.8 imapd process periodically sticks at 100% CPU

Bron Gondwana brong at fastmail.fm
Mon Sep 29 08:41:31 EDT 2008


On Mon, Sep 29, 2008 at 12:37:12PM +0100, Mark Cave-Ayland wrote:
> Hi there,
> 
> I'm experiencing a problem with Cyrus 2.3.8 interacting with an Outlook 
> client and was hoping this would be the right place to get some advice.
> 
> What happens is that periodically (maybe around once a month?) we have 
> one particular user who contacts us complaining that they are unable 
> to access their mailbox. We always find the same thing: there is 
> an imapd process accessing his seen DB which is running at 100% CPU. 
> Once this process is killed then things go back to normal and the user 
> can log in.
> 
> The latest report we had of this problem happening again was this 
> morning, and fortunately I was in a position to attack it with gdb and a 
> file of debug symbols. This showed that the process in question was 
> getting stuck in a loop in index_expungeuidlist(). I've uploaded the 
> transcript of my gdb session to 
> http://pastebin.siriusit.co.uk/cyrus-imapd-gdb.txt for people who are 
> familiar with cyrus internals.
> 
> The short story appears to be that newseenuids (new) points to an empty 
> string ('\0') and so the code gets stuck because of the following at 
> line 532 of imap/index.c in index_checkseen():
> 
> oldseen = (*old == ':');

oldseen = 0;
 
> Since *old is an empty string, oldseen will always be 0, and so the 
> while() loop never exits. Unfortunately this is the first time I've ever 
> looked at cyrus internals, so am not really sure what the seen list 
> should look like normally.

No, BUT.

while (oldnext <= uid) {
  ...
  if (!*old) oldnext = mailbox->last_uid+1;
}

If your mailbox is corrupted such that last_uid is less than
an actual uid in the mailbox, then oldnext can never get past
uid and you will get an infinite loop here.

> The confusing thing is that we have been using these packages for 
> several clients and this is the *only* particular server and the *only* 
> user on this server experiencing this problem. The one thing we have 
> noticed is that this particular user has a larger mailbox than 
> the other users (~1GB), but that doesn't seem large enough to 
> cause any problems.

Yeah, it's a corrupted mailbox.
 
> Finally, one more thing to add is that we have already gone through the 
> steps of rebuilding the seen DB skiplist using the skiplist.py script 
> several times when this has happened in the past, and it has made no 
> difference.

No, it won't.  You need to fix the mailbox or patch the code to not be
put into an infinite loop by a bogus index file.

The attached patch might do the trick for you.  I just slapped it
together on spec.  It compiles, that's about all I can offer about
it :)

Bron ( and no, I can't spell.  Tough ;) )
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cyrus-inifiniteloop-2.3.12.diff
Type: text/x-diff
Size: 911 bytes
Desc: not available
Url : http://lists.andrew.cmu.edu/pipermail/info-cyrus/attachments/20080929/f1fa6cd4/attachment.bin 

