skiplist_unsafe

David Carter dpc22 at cam.ac.uk
Tue Dec 11 05:34:13 EST 2007


On Fri, 7 Dec 2007, Janne Peltonen wrote:

>> If you feel that your filesystem/buffercache will do a good job at
>> writing things out to disk, and you've got battery-backed cache on
>> your storage, you should be relatively well off.
>
> But if I were to turn skiplist_unsafe on, and the OS crashed - or, say, 
> the cluster system forcibly unmounted my Cyrus spool and config 
> filesystems - wouldn't that result in horribly unrecoverable databases 
> all over the place? (I have everything in skiplist, except quota and 
> subscriptions.)

It is easy enough to find out.

Take an fsync() test rig such as Brad Fitzpatrick's diskchecker.pl,
comment out the fsync()s, and run it across a crash or forced unmount.
If the disk checker complains, then updates that were still sitting in
the buffer cache have been lost. Under Linux, running without fsync() is
only safe if the filesystem is mounted with the "sync" option;
data=journal on its own is not enough.
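
To make the idea concrete, here is a rough sketch of what such a rig does.
It is not diskchecker.pl itself: the names write_records(), count_records()
and lost_updates() and the 16-byte record layout are made up for
illustration. A real test must also record the acknowledged count somewhere
that survives the crash (diskchecker.pl reports to a second machine), which
this sketch leaves out.

```python
import os

RECORD_SIZE = 16  # hypothetical fixed record size used only in this sketch


def write_all(fd, data):
    """Keep calling os.write() until every byte has been handed to the kernel."""
    while data:
        data = data[os.write(fd, data):]


def write_records(path, count, use_fsync=True):
    """Append `count` numbered records and return the last one acknowledged.

    With use_fsync=False this mimics a rig with the fsync()s commented out:
    a record is "acknowledged" while it may still be only in buffer cache.
    """
    acknowledged = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for n in range(1, count + 1):
            write_all(fd, b"%015d\n" % n)  # 15 digits + newline = 16 bytes
            if use_fsync:
                os.fsync(fd)  # force the record to stable storage first
            acknowledged = n  # a real rig reports this to another machine
    finally:
        os.close(fd)
    return acknowledged


def count_records(path):
    """Count the complete records actually present in the file."""
    with open(path, "rb") as f:
        return len(f.read()) // RECORD_SIZE


def lost_updates(path, acknowledged):
    """Number of acknowledged records that are missing from the file."""
    return max(0, acknowledged - count_records(path))
```

After the crash, a non-zero result from lost_updates() for the count that
was acknowledged beforehand is the "moaning" case: the application was told
the data was written, but it never reached the disk.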

Part of the point of fsync() is to make sure updates hit nonvolatile 
storage in the correct order. A specific example: skiplist commit records 
are written after an fsync(), immediately followed by another fsync() 
before the write lock is released. If writes get reordered before they hit
disk (for example, a commit record reaching disk ahead of the data it
commits), then there is a good chance that the database will become corrupt.
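
The ordering described above looks roughly like this. It is only a sketch
of the write / fsync() / commit / fsync() pattern, not the actual Cyrus
skiplist code: the COMMIT marker, commit_update() and is_committed() are
invented for the example, and the write lock is left out.

```python
import os

COMMIT = b"COMMIT\n"  # hypothetical marker, not the real skiplist format


def write_all(fd, data):
    """Keep calling os.write() until every byte has been handed to the kernel."""
    while data:
        data = data[os.write(fd, data):]


def commit_update(path, payload, use_fsync=True):
    """Append `payload`, then a commit marker, with an fsync() on each side."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        write_all(fd, payload)
        if use_fsync:
            os.fsync(fd)  # payload must be on disk before the commit marker
        write_all(fd, COMMIT)
        if use_fsync:
            os.fsync(fd)  # marker must be on disk before the lock is released
    finally:
        os.close(fd)


def is_committed(path):
    """True if the file ends with a commit marker."""
    with open(path, "rb") as f:
        return f.read().endswith(COMMIT)
```

With use_fsync=False nothing stops the kernel or the disk from writing the
marker before the payload, so after a crash is_committed() could return
true for a file whose payload is incomplete.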

-- 
David Carter                             Email: David.Carter at ucs.cam.ac.uk
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

