annotations.db corruption

Bernhard Reiter bernhard at intevation.de
Fri Apr 7 09:43:04 EDT 2006


Hello *,

we are currently debugging a problem with the Kolab Groupware server
<http://www.kolab.org/>.  Cyrus imapd is an important core component of
the Kolab server, and Kolab makes heavy use of mailbox annotations.

We use skiplist as the backend for annotations.db, and some users
have experienced corruption of that file.

These problems always follow the same pattern:

1. Something goes wrong (this is the most interesting part, as we
   do not yet know _what_ exactly goes wrong).

2. A notice about a partial transaction is written to the log:
   imap[16722]: skiplist recovery /kolab/var/imapd/annotations.db: found partial txn, not replaying

3. An attempt to restart imapd then fails and writes to the log:
   imap[16753]: DBERROR: skiplist recovery /kolab/var/imapd/annotations.db: A9FC8 should be ADD or DELETE
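
As an illustration of the message in step 3 (a sketch only, not the
Cyrus recovery code): assuming record types are stored as big-endian
32-bit words, with the codes 2 (ADD), 4 (DELETE) and 255 (COMMIT) as
labelled in the dump further down, a word of all zeros is none of
these.  The function names below are hypothetical.

```python
import struct

# Type codes as labelled in the annotated dump in this mail.
RECORD_TYPES = {2: "ADD", 4: "DELETE", 255: "COMMIT"}


def record_type_at(data, offset):
    """Return the record type name stored at `offset`, or None if unknown."""
    # Assumption: the type is a big-endian unsigned 32-bit word.
    (word,) = struct.unpack_from(">I", data, offset)
    return RECORD_TYPES.get(word)


def check_after_commit(data, offset):
    """Hypothetical stand-in for the recovery check; not taken from Cyrus."""
    name = record_type_at(data, offset)
    if name not in ("ADD", "DELETE"):
        raise ValueError("%04X should be ADD or DELETE" % offset)
    return name


if __name__ == "__main__":
    # COMMIT, DELETE entry, COMMIT, then one word of null bytes.
    data = bytes.fromhex("000000ff" "00000004" "001179b4" "000000ff" "00000000")
    print(check_after_commit(data, 4))
    try:
        check_after_commit(data, 16)
    except ValueError as err:
        print("error:", err)
```

Under these assumptions, a run of null bytes at the point where the next
transaction should start would produce exactly this kind of complaint.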

The actual defect in the annotations.db skiplist file also always
follows the same pattern:

In the "log" part of the skiplist file, a run of null bytes appears
exactly between two valid transactions, roughly the length of an
ADD node:

    0012e910: 7365 7276 6572 2064 6965 0074 6578 742f
    0012e920: 706c 6169 6e00 4424 2e1e 0000 0012 e9c4
    0012e930: 0013 47f4 0013 496c 0013 4ce8 ffff ffff
                                            ^^^^^^^^^
                               regular end of ADD entry
    0012e940: 0000 00ff 0000 0004 0011 79b4 0000 00ff
              ^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
              COMMIT    DELETE entry        COMMIT
    0012e950: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e960: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e970: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e980: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e990: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e9a0: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e9b0: 0000 0000 0000 0000 0000 0000 0000 0000
    0012e9c0: 0000 0000 0000 0002 0000 0032 706f 6c79
                        ^^^^^^^^^
                        regular start of ADD entry
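
To look for further instances of this pattern in a file, a scan along
the following lines could be used.  This is only a sketch in Python:
it assumes that records are aligned to 4 bytes (which is consistent
with the dump above), and the function name, the `base` parameter and
the `min_words` threshold are made up for illustration.

```python
def find_zero_runs(data, base=0, min_words=4):
    """Return (offset, length_in_bytes) for each run of all-zero 32-bit words.

    `base` is added to the reported offsets so that an excerpt can be
    scanned with its original file offsets.  Runs shorter than
    `min_words` words are ignored.
    """
    runs = []
    start = None  # word index where the current zero run began
    nwords = len(data) // 4
    for i in range(nwords):
        if data[4 * i:4 * i + 4] == b"\x00\x00\x00\x00":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_words:
                runs.append((base + 4 * start, 4 * (i - start)))
            start = None
    # A run that extends to the end of the buffer.
    if start is not None and nwords - start >= min_words:
        runs.append((base + 4 * start, 4 * (nwords - start)))
    return runs


# The excerpt from 0012e940 to 0012e9cf, rebuilt from the dump above.
sample = (
    bytes.fromhex("000000ff" "00000004" "001179b4" "000000ff")
    + bytes(7 * 16)  # the seven all-zero rows 0012e950 .. 0012e9bf
    + bytes.fromhex("00000000" "00000002" "00000032" "706f6c79")
)

if __name__ == "__main__":
    for offset, length in find_zero_runs(sample, base=0x12E940):
        print("zero run at %08x, %d bytes" % (offset, length))
```

For the excerpt above, this reports a single run starting at 0012e950
with a length of 116 bytes (0x74), ending right before the ADD entry at
0012e9c4.  Legitimate zero words inside keys or data would also be
reported, so any hit still needs to be checked by hand.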

Issue 840 in the official Kolab bug tracker contains some more
in-depth information on the problem and our analysis so far:
https://intevation.de/roundup/kolab/issue840

Does anyone here on the list have an idea what the cause of this could
be?  Any hints on how to continue debugging, or any sound theory about
the background of this problem, would be highly appreciated.

One more catch: we have found no way so far to reproduce this
problem.  But we have customers who experience it on a more or less
regular basis (about once a month).