FastMail Cyrus Patches for upstream
murch at andrew.cmu.edu
Tue Aug 25 07:06:21 EDT 2009
The CRC additions make sense to me. I will look at the rest of the
patches when I get back from vacation next week.
Bron Gondwana wrote:
> UP-FRONT notice. I'd particularly love feedback on the index
> format change. Here's the executive overview:
> index header: replace SPARE4 with HEADER_CRC - a CRC32 of
> the rest of the header.
> index record: add two additional 32bit values, CACHE_CRC
> and RECORD_CRC. CACHE_CRC is a crc32 of the entire cache
> record, and RECORD_CRC is a crc32 of the entire index
> record (including CACHE_CRC) - providing integrity checking
> all the way through.
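
A minimal sketch of the chaining described above, not the actual patch: CACHE_CRC is stored inside the index record, and RECORD_CRC then covers every byte of the record before it, CACHE_CRC included. The record size, field offsets, big-endian storage and all `sketch_` names are assumptions made for illustration. The bitwise CRC-32 routine stands in for zlib's crc32() so the example has no dependencies.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the same checksum
 * zlib's crc32() produces; slow, but dependency-free. */
uint32_t sketch_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical layout: payload, then CACHE_CRC, then RECORD_CRC. */
enum {
    SKETCH_PAYLOAD_LEN    = 16,
    SKETCH_OFF_CACHE_CRC  = SKETCH_PAYLOAD_LEN,
    SKETCH_OFF_RECORD_CRC = SKETCH_OFF_CACHE_CRC + 4,
    SKETCH_RECORD_LEN     = SKETCH_OFF_RECORD_CRC + 4
};

/* Assumed big-endian (network order) storage of 32-bit fields. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Fill in both CRC fields; order matters because RECORD_CRC covers CACHE_CRC. */
void sketch_seal_record(unsigned char *rec,
                        const unsigned char *cache, size_t cache_len)
{
    put_be32(rec + SKETCH_OFF_CACHE_CRC, sketch_crc32(cache, cache_len));
    put_be32(rec + SKETCH_OFF_RECORD_CRC,
             sketch_crc32(rec, SKETCH_OFF_RECORD_CRC));
}

/* Returns 1 if the stored RECORD_CRC matches the record bytes, else 0. */
int sketch_record_ok(const unsigned char *rec)
{
    return get_be32(rec + SKETCH_OFF_RECORD_CRC) ==
           sketch_crc32(rec, SKETCH_OFF_RECORD_CRC);
}
```

Because RECORD_CRC is computed last, a flipped bit in either the payload or the stored CACHE_CRC shows up as a record mismatch before the cache file is even read.
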
> Total additional cost: 8 bytes per message plus some CPU time
> creating and checking the CRCs. Benefit - immediate index
> corruption detection. I think this is a good thing - in
> theory the underlying layers should be providing perfect
> abstractions, but in practice a memory error, disk error or
> even erratic cable can cause transient failures - and if we
> write those incorrect values back to the file they last
> Ok - onto the main show!
> I've got a small pile of patches for upstream... some quite
> old and heavily tested, and a couple because I want to grab
> dibs on index minor_version 11 before someone else claims it
> and makes a total mess of our patch management!
> OK - here we go ( sorry about the long URLS - you can just go
> to http://cyrus.brong.fastmail.fm/ and follow the links, or of
> course hit github at http://github.com/brong/cyrus-imapd/ )
> Use a struct of individual cache items rather than macros.
> This allows sanity checks on the cache record to detect
> corruption and avoid crashing!
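
To illustrate the idea, here is a rough sketch of parsing a cache record into a struct while bounds-checking every item. It is not the code from the patch. The layout (a fixed number of items, each a 4-byte big-endian length followed by that many bytes) and every `sketch_` name are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_ITEMS 3  /* hypothetical item count */

struct sketch_cache_item {
    const unsigned char *data;  /* points into the caller's buffer */
    size_t len;
};

struct sketch_cache_record {
    struct sketch_cache_item item[SKETCH_NUM_ITEMS];
};

/* Parse `size` bytes at `buf` into `out`.
 * Returns 0 on success, -1 if any item would run past the buffer. */
int sketch_parse_cache(const unsigned char *buf, size_t size,
                       struct sketch_cache_record *out)
{
    size_t off = 0;  /* invariant: off <= size */

    for (int i = 0; i < SKETCH_NUM_ITEMS; i++) {
        uint32_t len;

        if (size - off < 4)
            return -1;  /* no room for the length word */
        len = ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
              ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
        off += 4;

        if (len > size - off)
            return -1;  /* corrupt length: refuse rather than overrun */

        out->item[i].data = buf + off;
        out->item[i].len = len;
        off += len;
    }
    return 0;
}
```

With macros that compute offsets on the fly, a corrupt length silently walks off the end of the mapped file. With a parse step like this, the caller gets an error it can log and handle.
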
> NOTE: I'd love to do this with index records as well, but it's
> an awful lot of work. I'll be doing that slowly as time permits.
> NOTE - this is a huge patchset, and it's had an enormous amount of
> work done on it! This completely changes the charset encoding
> pathways within Cyrus. It gives unicode 5.1 support, a bunch of
> new character sets, and full utf-8 support in sieve scripts. It
> also allows search with whitespace to work by compressing
> whitespace to a single space rather than removing it entirely.
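
The whitespace behaviour can be sketched as follows. This is a simplified stand-alone illustration working on plain bytes, not the charset conversion code from the patchset, and `sketch_squeeze_ws` is a made-up name.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy `in` to `out`, replacing each run of whitespace with one space.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the length of the result, not counting the terminator. */
size_t sketch_squeeze_ws(const char *in, char *out)
{
    size_t n = 0;
    int in_space = 0;  /* true while inside a whitespace run */

    for (; *in; in++) {
        if (isspace((unsigned char)*in)) {
            if (!in_space)
                out[n++] = ' ';
            in_space = 1;
        } else {
            out[n++] = *in;
            in_space = 0;
        }
    }
    out[n] = '\0';
    return n;
}
```

If both the search key and the message text are normalised this way, "hello   world" and "hello world" compare equal, while "helloworld" stays distinct. Removing whitespace entirely would lose that distinction.
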
> This is the new one! It's not entirely complete in its behaviour yet,
> it only syslogs for issues, and it's not syslogging on all paths that
> read the index and cache records yet. I'll work on adding those over
> time. I _believe_ it's creating crc32s on all paths that modify the
> record, which is the important thing!
> I've chosen to only implement crc32 where zlib is available,
> putting stubs that return 0 otherwise. It would be easy
> enough to copy in the public domain crc32 code that's out there
> if we want to support everyone.
> I've also chosen to ignore a cache_crc of zero, so that we can upgrade
> indexes without a huge IO hit as we read the entire cache to find the
> initial values. It means one in 2^32 records won't have integrity
> protection. I can live with that.
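
One way to picture the two choices above is the sketch below: a stub that returns 0 when zlib is missing, and a stored CRC of 0 treated as "not computed". It is written under assumptions, not taken from the patch. `HAVE_ZLIB` is assumed to be the configure-style macro, and the `sketch_` names are invented here. The zlib call itself, crc32(), is the real one.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_ZLIB  /* assumed build macro */
#include <zlib.h>
uint32_t sketch_crc32_buf(const unsigned char *buf, size_t len)
{
    /* crc32(0L, Z_NULL, 0) yields zlib's initial CRC value */
    return (uint32_t)crc32(crc32(0L, Z_NULL, 0), buf, (uInt)len);
}
#else
/* Stub: without zlib nothing is computed and 0 gets stored. */
uint32_t sketch_crc32_buf(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}
#endif

enum sketch_crc_verdict {
    SKETCH_CRC_UNCHECKED,  /* stored value is 0: nothing to compare */
    SKETCH_CRC_OK,
    SKETCH_CRC_MISMATCH
};

/* A stored CRC of 0 means "never computed" (upgraded index or stub
 * build), so it is skipped instead of being reported as corruption. */
enum sketch_crc_verdict sketch_check_crc(uint32_t stored, uint32_t computed)
{
    if (stored == 0)
        return SKETCH_CRC_UNCHECKED;
    return stored == computed ? SKETCH_CRC_OK : SKETCH_CRC_MISMATCH;
}
```

The cost of the zero sentinel is the one mentioned above: a record whose genuine CRC happens to be 0 is indistinguishable from one that was never checksummed.
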
> Comments please. Once I've dumped this stuff in CVS I'd love to cut
> another release with all the cool new features so other people can use
> them. The charset support in particular is a nice user-visible thing
> that fixes a bunch of bugzilla bugs and makes fixing others a lot
Carnegie Mellon University