CYRUS_SKIPLIST_UNSAFE

Lawrence Greenfield leg+ at andrew.cmu.edu
Wed Oct 23 17:04:54 EDT 2002


   From: "Rob Mueller" <robm at fastmail.fm>
   Date: Wed, 23 Oct 2002 15:30:09 +1000
[...]
   So this lets you turn off the use of fsync() calls altogether, but is clearly
   regarded as "unsafe". I was wondering however, how "unsafe" it would be to
   remove the calls from the commit() code, which I'm guessing is called the
   most, but leave them in the checkpoint() and recovery() code?

   Doesn't "unsafe" in this instance really mean "unsafe if the system
   crashes"? It's not really unsafe if the program itself crashes, because
   any changes made will still be written back eventually by the OS. And if
   the OS crashes, will it still be able to recover back to the last checkpoint
   if checkpoint() and recovery() still use fsync() calls?

That's correct, "unsafe" means "unsafe if the system crashes". (The
main use of CYRUS_SKIPLIST_UNSAFE is to make cvt_cyrusdb faster.)

But "checkpoints" don't happen every N seconds---they happen when the
skiplist file has reached a certain size (due to so much write volume)
and merely serve to keep the size of the file down (skiplist files can
grow to twice the size they "should" be).

Recovery isn't guaranteed to succeed (or necessarily be sane) when the
fsync()s are off. The fsync()s in commit() force an ordering---they
make sure the data is on disk before a 4-byte COMMIT record is
written. 

I believe the 2nd fsync() could be omitted while retaining the ACI
properties (just losing durability---definitely ok for seen state). I'd
have to think more carefully before guaranteeing it.

Losing the first fsync() would compromise consistency and integrity
since, on a crash, the 4-byte COMMIT record might be written before
the data, causing recovery to include the now-bogus data in the live
file.
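
To make the ordering concrete, here is a rough sketch in C of a commit
with the two fsync()s described above. It is not the actual skiplist
code: open_log(), sketch_commit(), write_all() and the COMMIT_MARK value
are all hypothetical names and values chosen for illustration, and the
real record layout is not shown.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical 4-byte commit marker; the real on-disk value is an assumption. */
static const uint32_t COMMIT_MARK = 0xFFFFFFFFu;

/* Hypothetical helper: open (or create) an append-only log file. */
int open_log(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_APPEND, 0600);
}

/* Write all of buf, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Append one transaction's records, then the COMMIT record.
 * If skip_second_fsync is nonzero, only durability is given up. */
int sketch_commit(int fd, const void *records, size_t len,
                  int skip_second_fsync)
{
    if (write_all(fd, records, len) < 0) return -1;

    /* 1st fsync: the data must be on disk before the COMMIT record,
     * otherwise a crash could leave a COMMIT that covers bogus data. */
    if (fsync(fd) < 0) return -1;

    if (write_all(fd, &COMMIT_MARK, sizeof COMMIT_MARK) < 0) return -1;

    /* 2nd fsync: makes the commit itself durable before returning. */
    if (!skip_second_fsync && fsync(fd) < 0) return -1;

    return 0;
}
```

Calling sketch_commit(fd, buf, len, 1) corresponds to dropping only the
2nd fsync(). There is deliberately no switch for dropping the first one,
since that is the one that enforces the ordering.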

Larry




