DBERROR: skiplist recovery errors
Lawrence Greenfield
leg+ at andrew.cmu.edu
Thu Dec 12 18:16:03 EST 2002
From: "Rob Mueller" <robm at fastmail.fm>
Date: Thu, 12 Dec 2002 13:01:48 +1100
Looks like you've got corrupted skiplist files. Delete the seen state
databases with the problem and it will automatically rebuild them.
That's what we do.
It's scarier when you see this on the mailboxes DB.
You've had this problem on your mailboxes db? Yuck.
Which reminds me. The skiplist recovery code barfs if it comes across this
error and gives up. However, if you truncate the length of the file to just
smaller than the problem location, then it happily recovers, even though
there's an incomplete record. I really don't think recovery should ever just
'give up'. Making a backup of the bad DB, removing the offending records,
warning the user and continuing would be much nicer.
That's a good idea; could you file it as a bug?
Doing this could destroy most of the database and could be even more
confusing to system administrators. I guess reasonable syslog()
records would help with that.
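
For the sake of discussion, here is a minimal sketch of the "back up, truncate, warn, continue" idea in plain POSIX C. It is not the Cyrus recovery code: salvage_truncate(), copy_file() and the ".corrupt" suffix are hypothetical names made up for this example, and it assumes the caller already knows the offset of the first bad record. It truncates at that offset rather than removing individual records, and it refuses to truncate if the backup cannot be written.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <syslog.h>
#include <unistd.h>

/* Copy the file at `path` to a new file `bakpath`.
 * O_EXCL makes this fail rather than overwrite an earlier backup.
 * Returns 0 on success, -1 on any error. */
static int copy_file(const char *path, const char *bakpath)
{
    char buf[8192];
    ssize_t n = 0;
    int in, out, r = -1;

    in = open(path, O_RDONLY);
    if (in < 0) return -1;
    out = open(bakpath, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out < 0) { close(in); return -1; }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        char *w = buf;
        while (n > 0) {                 /* handle short writes */
            ssize_t m = write(out, w, (size_t)n);
            if (m < 0) goto done;
            w += m;
            n -= m;
        }
    }
    if (n == 0 && fsync(out) == 0) r = 0;   /* n < 0 means read error */
done:
    close(in);
    if (close(out) != 0) r = -1;
    return r;
}

/* Hypothetical helper: keep a copy of the damaged file, then cut the
 * original off at bad_offset so that recovery can carry on.
 * Every step is logged so the administrator can see what was dropped. */
static int salvage_truncate(const char *path, off_t bad_offset)
{
    char bakpath[4096];
    int n = snprintf(bakpath, sizeof bakpath, "%s.corrupt", path);

    if (n < 0 || (size_t)n >= sizeof bakpath) return -1;

    if (copy_file(path, bakpath) != 0) {
        syslog(LOG_ERR, "DBERROR: cannot back up %s to %s, not truncating",
               path, bakpath);
        return -1;
    }
    if (truncate(path, bad_offset) != 0) {
        syslog(LOG_ERR, "DBERROR: truncating %s failed: %s",
               path, strerror(errno));
        return -1;
    }
    syslog(LOG_WARNING,
           "DBERROR: %s truncated at offset %lld, original saved as %s",
           path, (long long)bad_offset, bakpath);
    return 0;
}
```

A caller would invoke something like salvage_truncate(fname, offset_of_bad_record) and then rerun normal recovery on the shortened file. Everything after the offset is lost from the live file, which is exactly why the backup and the syslog() lines matter.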
Also there's a bit of code that looks like this:
    for (;;) {
        p += RECSIZE(p);
        if (p >= q) break;
        if (TYPE(p) == COMMIT) break;
    }
I had a corrupted DB once where RECSIZE(p) == 0. This just went into an
infinite loop. Also not a good idea when trying to recover a database.
Probably verifying that RECSIZE > 0 would be good. Though it looks
like RECSIZE == 0 implies that the TYPE of the record isn't valid,
either.
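
A rough sketch of what such a check could look like, so that the scan always either advances or stops. The record layout here (a 4-byte type followed by a 4-byte total size, host byte order) and the names rd32(), next_commit(), REC_* and HDRSIZE are assumptions invented for the example; they are not the real skiplist on-disk format or the real RECSIZE/TYPE macros. Unlike the loop quoted above, this version validates the record at p before stepping over it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, for illustration only:
 *   bytes 0..3  record type
 *   bytes 4..7  total record size in bytes, header included */
#define REC_ADD    1u
#define REC_DELETE 2u
#define REC_COMMIT 3u
#define HDRSIZE    8u

enum scan_status { SCAN_COMMIT, SCAN_END, SCAN_CORRUPT };

/* Read a 32-bit value without assuming p is aligned. */
static uint32_t rd32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Walk records in [p, end) until a COMMIT record is found.
 * Requires p <= end. *stop is set to the COMMIT record, to end,
 * or to the first record that fails validation.
 * A size of 0 (or anything smaller than the header) is treated as
 * corruption, so the loop can never spin in place. */
static enum scan_status next_commit(const unsigned char *p,
                                    const unsigned char *end,
                                    const unsigned char **stop)
{
    for (;;) {
        size_t left = (size_t)(end - p);
        uint32_t type, size;

        *stop = p;
        if (left == 0) return SCAN_END;
        if (left < HDRSIZE) return SCAN_CORRUPT;    /* truncated header */

        type = rd32(p);
        size = rd32(p + 4);
        if (type < REC_ADD || type > REC_COMMIT)    /* unknown type */
            return SCAN_CORRUPT;
        if (size < HDRSIZE || size > left)          /* zero, short or overlong */
            return SCAN_CORRUPT;

        if (type == REC_COMMIT) return SCAN_COMMIT;
        p += size;                                  /* always moves forward */
    }
}
```

On SCAN_CORRUPT, *stop gives the offset of the offending record, which is the natural place to log from and, if desired, to cut the file.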
Hmm.
Larry