masters signals cyr_expire to death by 11

Bron Gondwana brong at fastmail.fm
Tue Sep 4 23:40:50 EDT 2007


On Tue, Sep 04, 2007 at 02:28:35PM -0400, Wesley Craig wrote:
> Just to be clear, that's cyr_expire dying, not being killed by  
> master.  Signal 11 is SIGSEGV, which is to say a segmentation  
> violation, i.e., a bug in cyr_expire.  Personally, I've seen this  
> error on a lot of systems.  The mailbox after the last one you listed  
> probably has a corrupt meta file -- indicative of a bug in another  
> process.  Reconstruct will typically correct these sorts of problems.

http://cyrus.brong.fastmail.fm/patches/cyrus-expire-prefix-2.3.9.diff

Basically, the cyrus.cache file has a bunch of field length data in it,
and cyr_expire walks through that.  If you have a corrupt cyrus.cache
then the next memory address it calculates can land outside the mmap'ed
file and *BOOM* - segfault.

The patch above does range checking on the next calculated address each
time, and if it's outside the range, it logs the mailbox and returns
IOERROR.  You still need to reconstruct, but at least cyr_expire keeps
running and you have the exact mailbox with the error in your logs
(otherwise you only have the last mailbox with messages expunged, which
gives a range to check but not an exact target).
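
To make the idea concrete, here is a rough sketch of that kind of range
check.  It is not the code from the patch: the field layout (a 4-byte
big-endian length followed by data padded to a 4-byte boundary) and the
names read_be32, skip_fields and FIELD_WALK_ERROR are assumptions made
up for illustration.  The point is only that every computed offset is
checked against the size of the mapped file before it is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error marker; the real patch returns an IOERROR code. */
#define FIELD_WALK_ERROR ((size_t)-1)

/* Decode a 4-byte big-endian length without assuming alignment. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Walk `nfields` length-prefixed fields starting at `offset` inside a
 * mapped buffer of `size` bytes.  Assumed layout (illustrative only):
 * 4-byte big-endian length, then the data padded to a multiple of 4.
 *
 * Returns the offset just past the last field, or FIELD_WALK_ERROR if
 * any length would take us outside the buffer.
 */
static size_t skip_fields(const unsigned char *base, size_t size,
                          size_t offset, unsigned nfields)
{
    unsigned i;

    for (i = 0; i < nfields; i++) {
        uint32_t len;
        size_t remaining, padded;

        /* The length word itself must be inside the buffer. */
        if (offset > size || size - offset < 4)
            return FIELD_WALK_ERROR;

        len = read_be32(base + offset);
        remaining = size - offset - 4;

        /* Check the raw length first so the padding sum cannot wrap. */
        if (len > remaining)
            return FIELD_WALK_ERROR;

        padded = ((size_t)len + 3) & ~(size_t)3;
        if (padded > remaining)
            return FIELD_WALK_ERROR;

        offset += 4 + padded;
    }
    return offset;
}
```

A caller would log the mailbox name and bail out with an I/O error when
it sees FIELD_WALK_ERROR, rather than dereferencing the bad address.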

(it also adds a -p "mailbox prefix" option so you can expire just a
small range of mailboxes, which makes it easy to find your bogus folder
again)

Regards,

Bron.

