FastMail Cyrus Patches for upstream

Wed Aug 26 21:42:14 EDT 2009

On Wed, Aug 26, 2009 at 02:47:29PM -0400, Brian Awood wrote:
> Lately we've been chasing cache file issues here pretty regularly and 
> we just found something yesterday that might indicate what's 
> occurring.  While I like the idea of keeping a checksum of the 
> records, it probably won't solve the issues we are seeing.  

No, probably not!

> It seems to probably relate to delayed expunge, the cyr_expire and 
> unexpunge processes.  After an unexpunge, some cache records end up 
> with a negative offset from the previous message.  E.g, the cache 
> record for message 110 may have a location earlier in the file than 
> the record for message 109.  Cyrus tries to calculate the length of 
> record 109, gets a negative size, then hilarity and lots of error 
> logs ensue. 

I have a fix for that which avoids the hilarity.  It's in that queue,
and it makes all cache accesses go through one API which does bounds
checking.

> Initially, it seems like index might be corrupted with a bogus offset 
> for message 110, which just happens to be within range, but it isn't.  
> The record it points to is valid and matches the message on disk.  So 
> it seems like there's just a bug somewhere causing the cache file to 
> be written out of order, but I think there are possibly other issues 
> being caused by calculating the cache record length each time it's 
> needed.  One example might be that we periodically find a cache file 
> that is unnecessarily large for the number of messages that are 
> present (sometimes to the point of not being able to mmap it).  

No - the calculation of record length each time it's needed is only a
problem when the pointer into the cache file is garbage. In which case
getting the correct length bit of garbage is still going to ruin your
day later on when it tries to parse the internal structure of the cache
record and gets all confused.

The only thing that makes it work OK is pointing to the correct location.

There's a bug in process_records as used by ipurge even in the current
release (I think - I pushed the fix to CVS a while back), whereby it used
an incompatible set of flags.  Basically you can't do an immediate delete
EVER when delayed delete is present, or the cache offsets get screwed.
The easy fix was to make ipurge a delayed delete.  I'm not sure that
unexpunge is fixed at all... because I don't use it.  But I think it's
OK now.

> It seems like all these would be non-issues if the length of each 
> cache record was stored along with the offset in the index file 
> rather than trying to calculate it based on the offsets.  Does anyone 
> know if there a reason the cache record length isn't stored along 
> with the offset?  

No, they'd still be issues.  The correct fix is to ensure there's no
code path where legitimate use creates invalid cache pointers.  The
CRC checking is orthagonal to that except that it will make it even
more clear when (if, say if) we mess it up.

Bron.