Elusive replication bug in 2.3.13

Rob Mueller robm at fastmail.fm
Tue Jan 13 20:25:02 EST 2009


> Since I upgraded to 2.3.13 (Invoca RPM rev 4) I've been running into a
> mysterious replication bug. In some circumstances, creating a user with a
> three letter long username causes the sync master process to choke, on
> either signal 11 or 6. Like this:

Interestingly, we just encountered this as well after a new rebuild
from CVS.

I got a stack backtrace, which showed the crash was inside realloc(). I
looked at the code around there and couldn't see anything obviously wrong.
Bron looked at it and saw the problem in 10 seconds. I think I've been out
of practice with C programming for too long :)

The bug was introduced in this commit, where Ken changed a glob() call to a
custom readdir() loop:

http://github.com/brong/cyrus-imapd/commit/56d0823d3192ea13fa3afe397d625252b16252b9

There's a missing sizeof() in there, so the array ends up allocated too
small: if you have more than 25 quotas in a dir (or more than 12 on a 64-bit
build), you'll get memory corruption.
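
For anyone who doesn't want to read the diff, here is roughly what this class
of bug looks like. This is only a minimal sketch, not the actual Cyrus code:
struct path_list, path_list_add(), path_list_free(), buggy_capacity() and
PATH_ALLOC_STEP are all hypothetical names, and the growth step of 100 is an
assumption, picked because 100 bytes holds 25 four-byte pointers or 12
eight-byte pointers, which is consistent with the limits mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growth step, counted in entries (an assumption, not from Cyrus) */
#define PATH_ALLOC_STEP 100

/* Hypothetical growable list of path strings collected from a readdir() loop */
struct path_list {
    char **paths;   /* array of heap-allocated strings */
    size_t count;   /* entries in use */
    size_t alloc;   /* entries allocated */
};

/* How many pointers actually fit if an entry count is passed to realloc()
 * as if it were a byte count, i.e. with the sizeof() left out. */
static size_t buggy_capacity(size_t requested)
{
    return requested / sizeof(char *);
}

/* Append a copy of path, growing the array when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int path_list_add(struct path_list *list, const char *path)
{
    assert(list != NULL && path != NULL);

    if (list->count == list->alloc) {
        size_t new_alloc = list->alloc + PATH_ALLOC_STEP;

        /* Buggy form:  realloc(list->paths, new_alloc)
         * That asks for new_alloc BYTES, so later stores run off the end.
         * Correct form multiplies by the element size: */
        char **tmp = realloc(list->paths, new_alloc * sizeof(char *));
        if (tmp == NULL)
            return -1;
        list->paths = tmp;
        list->alloc = new_alloc;
    }

    char *copy = malloc(strlen(path) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, path);
    list->paths[list->count++] = copy;
    return 0;
}

/* Release every stored string and the array itself. */
static void path_list_free(struct path_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->paths[i]);
    free(list->paths);
    list->paths = NULL;
    list->count = list->alloc = 0;
}
```

To use it, start from a zeroed struct (struct path_list list = {0};), call
path_list_add() once per directory entry, and call path_list_free() when
done. With the buggy form nothing fails at the realloc() itself; the heap
gets trashed by the later writes, which would explain why the crash shows up
inside a subsequent realloc() rather than at the line that is actually wrong.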

Here's the patch that fixes it:

http://github.com/brong/cyrus-imapd/commit/1de9d758aeb360714236388c4e1689db0522c21e

All the fastmail patches are now on github, you can see them here:

http://github.com/brong/cyrus-imapd/commits/fastmail/
http://github.com/brong/cyrus-sieve/commits/fastmail/

Rob

FYI: We haven't changed to skiplist for quota dbs because they get a lot of
writes, and a skiplist db requires locking the entire db for the duration of
each write, so I'm worried about contention. To be honest, I haven't tested
this, so it might not be a problem at all. It's just something we'd need to
test first.


