FastMail.FM patchset - new patches

Thu Mar 15 08:25:23 EST 2007

On Thu, 15 Mar 2007 13:09:04 +0000 (GMT), "David Carter" <dpc22 at cam.ac.uk> said:
> On Thu, 15 Mar 2007, Rob Mueller wrote:
> 
> > May not be true, but:
> >
> >>   Is it safe? - we calculated that with one billion messages you have a one
> >>   in 1 billion chance of a birthday collision (two random messages with
> >>   the same UUID).
> >
> > Is true.
> 
> Fair enough.
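
For anyone who wants to redo the arithmetic, here is a rough sketch of the
standard birthday approximation, p ~= 1 - exp(-n(n-1) / 2^(b+1)).  The
function name and the list of bit widths are illustrative assumptions; how
many bits of the UUID are actually hash-derived is not stated in this thread,
so the width is left as a parameter.

```python
import math


def collision_probability(n_messages: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_messages share a hash value.

    Assumes hash values are uniformly distributed over 2**hash_bits outcomes.
    """
    # Number of distinct message pairs that could collide.
    pairs = n_messages * (n_messages - 1) / 2
    # Expected number of colliding pairs.
    expected = pairs / 2.0 ** hash_bits
    # 1 - exp(-expected), computed stably for very small values.
    return -math.expm1(-expected)


if __name__ == "__main__":
    # Bit widths chosen for illustration only.
    for bits in (64, 96, 128):
        p = collision_probability(10**9, bits)
        print(f"{bits:3d} bits: p = {p:.3e}")
```

The result depends heavily on how many bits are truly random, which is why
the width is not hard-coded.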
> 
> With hindsight I should probably have defined message UUIDs to be the
> full MD5 hash: 128 bits isn't that much worse than 96 bits per message.
> What is the CPU overhead like for calculating MD5 sums for everything on
> the fly?

Honestly, it's lost in the noise, especially since IO is the
main limiting factor on these machines.  Also, you only have to do it once
per message, at delivery time.  I'd be tempted to write an RFC for providing
both the MD5 and SHA1 hash via IMAP, and caching them both in the cyrus.cache
if not the cyrus.index.  That would make client clean-up-after-inconsistency
handling, and backups for that matter, much cleaner.
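
To illustrate the "once per message, at delivery time" point, here is a
minimal sketch that computes both digests in a single pass over the message
bytes.  It is not the Cyrus delivery code; the function name, the chunk size
and the returned dictionary are assumptions made for the example.

```python
import hashlib
from typing import BinaryIO, Dict


def digest_message(stream: BinaryIO, chunk_size: int = 65536) -> Dict[str, str]:
    """Read a message once and return its MD5 and SHA1 hex digests.

    Hypothetical helper: a real server would store the results alongside
    its other per-message metadata instead of returning a dict.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    # Feed each chunk to both hashers so the message is only read once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}
```

Since the message has to be read at delivery anyway, the hashing adds CPU
work but no extra IO.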

> UUIDs started out life as Mailbox UniqueID (64 bits) plus Message UID (32
> bits), hence the size and rather unfortunate name. The hash algorithm
> used to generate mailbox uniqueIDs is a bit basic, which is why I
> switched to generating them on the fly from master.
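
As a rough picture of that original layout, 64 bits of mailbox uniqueID
followed by 32 bits of message UID gives 96 bits, i.e. 12 bytes.  The sketch
below is not taken from lib/message_uuid; the function names and the
big-endian byte order are assumptions for illustration.

```python
import struct
from typing import Tuple


def pack_uuid(mailbox_uniqueid: int, message_uid: int) -> bytes:
    """Pack a 64-bit mailbox uniqueID and a 32-bit message UID into 12 bytes.

    Byte order is assumed big-endian here; the real on-disk format may differ.
    """
    return struct.pack(">QI", mailbox_uniqueid, message_uid)


def unpack_uuid(uuid: bytes) -> Tuple[int, int]:
    """Split a 12-byte UUID back into (mailbox_uniqueid, message_uid)."""
    return struct.unpack(">QI", uuid)
```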

Sure.  The UUID code looks really bolted on, which I guess it is.
lib/message_uuid* are nice, but the master integration and pass-by-env
and stuff is pretty messy!  Really, we have already proved that we get by
fine without them (given how many were all zero in our system already!)

Oh - by the way, don't go rolling out all our patches at once and then
reconstructing your mailboxes to get new UUIDs: you'll find UUID mismatches
across your replication system really fast!  I'm off to clean that up
now :(

Bron.
-- 
  Bron Gondwana
  brong at fastmail.fm