BDB and errors...

Tue Mar 14 16:48:00 EST 2006

> We're using cyrus 2.3 and everything works fine, except we seem to have
> intermittent problems with BDB 4.2 (specifically the RPM db4-4.2.52-3.1).
> We only use BDB for the delivery db.
>
> In general it works fine, however if for some reason a server has crashed
> and we reboot the server, we then seem to almost always have a problem
> with the DB.
>
> Probably best to show a sequence of events.
>
> 1. Server froze up, so force a hard reset
> 2. Server boots up and remounts everything fine. All partitions are
> reiserfs and mount ok with journal playback
> 3. We start cyrus. Since the delivery DB is temporary and non-critical,
> the start script explicitly does:
>
>      rm -f /var/imap/db/log.*
>      rm -f /var/imap/db/__db*
>      rm -f /var/imap/deliver.db
>
> To clean out all existing BDB state and information. I can confirm that
> the only files left in the /var/imap/db dir are DB_CONFIG and skipstamp.
> There appears to be no BDB environment state
> 4. cyrus appears to start fine, but intermittently we see errors in the
> log like:
>
> Mar 14 13:47:25 server1 lmtp[2514]: DBERROR: mystore: error storing <441702FE.6070601 at googlemail.com>: DB_PAGE_NOTFOUND: Requested page not found
>
> Each time an error like this occurs, it seems to leave a transaction open.
> Running:
>
> (cd /var/imap/db; /usr/bin/db_stat -t -h .)
>
> Normally shows "Active transactions" as 0, but after each of the above
> errors appears in the log, the count increases and never decreases.
> Eventually this causes problems because it appears that processes get
> stuck waiting for the transaction in a semi-busy loop inside BDB
> (continuous calls to select with a 1/10th of second timeout), and the
> checkpointing process can't cleanup old log files with open transactions
> in them. Eventually either the transaction count reaches the set_tx_max
> value, and causes BDB to go into error status, or the server load
> increases a lot due to the semi-busy wait loop BDB gets in.
>
> 5. Stopping cyrus, then starting it again with the exact same start script
> usually then fixes the problem
>
> That's the bit I don't get. Why would restarting again change anything, it
> seems that we're clearing out exactly the same data in each case, but
> there's definitely some weird state getting left behind after a hard
> reboot causing the errors, but I don't know where or why.
>
> Has anyone seen anything similar with their servers or has any idea what
> would be causing this?

I have no real idea what could cause this, but I have the following
sequence in my db conversion script, which is used by the init script in my
rpms. According to lots of my tests with different versions of db3 and db4
and cyrus-imapd, this procedure works best. As you can see, I first
try a db_checkpoint, kill it if it seems to hang, then do a
db_recover, and only after this do a rm -vf $imap_prefix/db/log.*
$imap_prefix/db/__db.*. I just tried to find out the safest procedure
after simulated crashes, without really understanding BDB and why people
like to use it so much. I don't, and my servers run fine without any BDB.

Simon

-%<-------------------------------
# make sure our Berkeley databases are in a sane state
# wait for db_checkpoint to end successfully or kill it after a timeout
db_checkpoint -v -1 -h "$imap_prefix/db" &
DB_CHECK_PID=$!
CNT=0
# poll once per second, for at most 60 seconds, until db_checkpoint exits
while [ "$CNT" -lt 60 ]; do
  if ! kill -0 "$DB_CHECK_PID" > /dev/null 2>&1; then
    break
  fi
  sleep 1
  CNT=$((CNT + 1))
done
# still running after the timeout: signal it, then force-kill and reap it
if kill -0 "$DB_CHECK_PID" > /dev/null 2>&1; then
  kill -USR1 "$DB_CHECK_PID" > /dev/null 2>&1
  sleep 1
  kill -KILL "$DB_CHECK_PID" > /dev/null 2>&1
  wait "$DB_CHECK_PID" > /dev/null 2>&1
fi

# do a normal recovery
db_recover -v -h "$imap_prefix/db"
RETVAL=$?
if [ "$RETVAL" -ne 0 ]; then
  # try a catastrophic recovery instead of normal recovery
  db_recover -v -c -h "$imap_prefix/db"
  RETVAL=$?
  # ERRVAL is accumulated across the whole script (set outside this excerpt)
  ERRVAL=$(( ERRVAL + RETVAL ))
  if [ "$RETVAL" -ne 0 ]; then
    echo "ERROR: catastrophic recovery of Berkeley databases failed"
  fi
fi

if [ "$EXPORT" = "export" ]; then
  # convert all db files to skiplist for migration
  # TODO: quota_db, we don't touch it for now
  cvt_file $imap_prefix/deliver.db           "skiplist"
  cvt_file $imap_prefix/mailboxes.db         "skiplist"
  cvt_file $imap_prefix/tls_sessions.db      "skiplist"
  cvt_file $imap_prefix/annotations.db       "skiplist"
  cvt_file $imap_prefix/ptclient/ptscache.db "skiplist"
  rm -vf $imap_prefix/db/log.*
  rm -vf $imap_prefix/db/__db.*
else
-%<-------------------------------
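
The wait-or-kill part of the excerpt above could be pulled out into a small
helper, roughly as sketched below. This is only an illustration, not part of
the original script: the name run_with_timeout and the return code 124 are
made up here, it sends TERM rather than USR1 before KILL, and it assumes bash
(which reaps finished background jobs, so that kill -0 starts failing once
the command has exited).

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): run a command in the
# background and kill it if it has not exited within a number of seconds.
# usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
  local timeout=$1
  shift
  "$@" &
  local pid=$!
  local cnt=0
  # poll once per second until the command exits or the timeout is reached
  while [ "$cnt" -lt "$timeout" ]; do
    kill -0 "$pid" 2>/dev/null || break
    sleep 1
    cnt=$((cnt + 1))
  done
  if kill -0 "$pid" 2>/dev/null; then
    # still running: ask politely, then force-kill and reap it
    kill -TERM "$pid" 2>/dev/null
    sleep 1
    kill -KILL "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    return 124   # assumed convention for "timed out"
  fi
  # command finished on its own: pass its exit status through
  wait "$pid"
}
```

With such a helper the checkpoint step would shrink to something like
run_with_timeout 60 db_checkpoint -v -1 -h "$imap_prefix/db", followed by the
db_recover calls as before.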