DBERROR: skiplist recovery errors

Lawrence Greenfield leg+ at andrew.cmu.edu
Thu Dec 12 19:00:51 EST 2002


   From: "Rob Mueller" <robm at fastmail.fm>
   Date: Fri, 13 Dec 2002 10:48:20 +1100

   >    It's scarier when you see this on the mailboxes DB.
   >
   > You've had this problem on your mailboxes db? Yuck.

   Yes, after system crashes. I think it was a SCSI card/driver problem.

Hmm. Well, this is not as surprising. The intention is that the
skiplist format is supposed to have ACID properties when used with
stable storage, but when I was designing it I decided to simplify the
format in many cases by reducing its resilience to byzantine
failures.

[Sidebar: skiplist assumes that although a single byte write might be
in an indeterminate state after a crash, that write will not affect
nearby data that is already known to be on stable storage. On
reflection I suspect the loop-forever problem could be caused by a
"normal" failure, and in such a case the "truncate file" option is
actually the right thing to do.]
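
To make the "truncate file" idea concrete, here is a minimal sketch of
that kind of recovery for an append-only file. It is not the skiplist
on-disk format or any Cyrus code: the record layout (a 4-byte length,
the payload, then a CRC32) and the names append_record and recover are
assumptions made up for illustration. It relies on the same assumption
as above, namely that a torn write at the tail does not damage earlier
records that already reached stable storage.

```python
import os
import struct
import zlib

# Hypothetical record layout, NOT the real skiplist format:
#   4-byte big-endian payload length | payload | 4-byte big-endian CRC32
HEADER = struct.Struct(">I")
TRAILER = struct.Struct(">I")


def append_record(path, payload):
    """Append one checksummed record and force it to stable storage."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload)))
        f.write(payload)
        f.write(TRAILER.pack(zlib.crc32(payload) & 0xFFFFFFFF))
        f.flush()
        os.fsync(f.fileno())


def recover(path):
    """Return every intact record; truncate the file after the last one."""
    with open(path, "rb") as f:
        data = f.read()

    records = []
    good_end = 0  # offset just past the last record that verified
    pos = 0
    while pos + HEADER.size <= len(data):
        (length,) = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + length + TRAILER.size
        if end > len(data):
            break  # record runs past end of file: torn write
        start = pos + HEADER.size
        payload = data[start:start + length]
        (crc,) = TRAILER.unpack_from(data, end - TRAILER.size)
        if crc != zlib.crc32(payload) & 0xFFFFFFFF:
            break  # checksum mismatch: stop trusting the file here
        records.append(payload)
        good_end = end
        pos = end

    if good_end < len(data):
        # Drop the indeterminate tail so later appends start from a
        # known-good offset.
        with open(path, "r+b") as f:
            f.truncate(good_end)
            f.flush()
            os.fsync(f.fileno())
    return records
```

Running recover twice in a row gives the same records and leaves the
file alone the second time, since nothing past the last good record
remains. Note that this only covers a damaged tail; it does nothing
for corruption in the middle of the file.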

But since we've seen this problem on hardware we believe to be good
(and that didn't even suffer an unexpected shutdown), I believe
there is a bug lurking out there. I just don't know what it is.

Larry





More information about the Info-cyrus mailing list