Elusive replication bug in 2.3.13
Rob Mueller
robm at fastmail.fm
Tue Jan 13 20:25:02 EST 2009
> Since I upgraded to 2.3.13 (Invoca RPM rev 4) I've been running into a
> mysterious replication bug. In some circumstances, creating a user with a
> three letter long username causes the sync master process to choke, on
> either signal 11 or 6. Like this:
Interestingly, we just encountered this as well after a new rebuild from
CVS.
I got a stack backtrace, which showed realloc() was choking. I looked at
the code around there, and couldn't see anything obviously wrong. Bron
looked at it and saw the problem in 10 seconds. I think I've been out of
practice with C programming for too long :)
The bug occurred in this commit, where Ken changed a glob() call to a custom
readdir() loop.
http://github.com/brong/cyrus-imapd/commit/56d0823d3192ea13fa3afe397d625252b16252b9
There's a missing sizeof() in there, meaning that if you have > 25 quotas in
a dir (or > 12 on a 64-bit build), you'll have memory corruption occurring.
Here's the patch to fix:
http://github.com/brong/cyrus-imapd/commit/1de9d758aeb360714236388c4e1689db0522c21e
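
For anyone who hasn't hit this class of bug before, here is a minimal
sketch of the pattern. It is not the code from the commits above: the
struct, the function names and the grow step of 100 are all made up for
illustration. The only point is that realloc() takes a size in bytes, so a
growable array of pointers needs the element count multiplied by sizeof().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of path strings (not the Cyrus structure). */
struct path_list {
    char **paths;   /* array of owned string copies */
    size_t count;   /* slots in use */
    size_t alloc;   /* slots allocated */
};

/* Grow step in elements; 100 is an assumption made for this sketch. */
#define PATH_LIST_GROW 100

/* Append a copy of path. Returns 0 on success, -1 if allocation fails. */
int path_list_add(struct path_list *list, const char *path)
{
    char **tmp;
    char *copy;

    assert(list != NULL && path != NULL);

    if (list->count == list->alloc) {
        size_t new_alloc = list->alloc + PATH_LIST_GROW;

        /* The buggy form would be realloc(list->paths, new_alloc),
         * which asks for new_alloc BYTES rather than new_alloc pointers. */
        tmp = realloc(list->paths, new_alloc * sizeof(*list->paths));
        if (tmp == NULL)
            return -1;
        list->paths = tmp;
        list->alloc = new_alloc;
    }

    copy = malloc(strlen(path) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, path);

    list->paths[list->count++] = copy;
    return 0;
}

/* Release every string and the array itself, leaving the list empty. */
void path_list_free(struct path_list *list)
{
    size_t i;

    for (i = 0; i < list->count; i++)
        free(list->paths[i]);
    free(list->paths);
    list->paths = NULL;
    list->count = 0;
    list->alloc = 0;
}
```

With the sizeof() left out and a step of 100, the first allocation would be
100 bytes, which is room for 25 four-byte pointers or 12 eight-byte ones.
That would line up with the thresholds mentioned above, but I'm inferring
it from the numbers, not reading it out of the commit.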
All the fastmail patches are now on github, you can see them here:
http://github.com/brong/cyrus-imapd/commits/fastmail/
http://github.com/brong/cyrus-sieve/commits/fastmail/
Rob
FYI: We haven't changed to skiplist for quota dbs because they get a lot of
writes, and a skiplist db requires locking the entire db for the duration of
each write, so I'm worried about contention issues. To be honest, I haven't
tested this, so it might not be a problem at all. It's just something we'd
need to test first.