Duplicate delivery DB (was Unexpected database recovery)

Richard Gilbert R.Gilbert at sheffield.ac.uk
Wed Dec 3 14:05:42 EST 2003


> > ... I applied John Wade's lock_flock patch to the version of Cyrus
> > impad we were already running, i.e. 2.1.14 and rebuilt and reinstalled.
> > cyrus-imapd was restarted at 5 am this morning to minimise inconvenience
> > to users.  I was surprised to find that the system was unavailable until
> > about 08:39 because of database recovery.
> >
> > Nov 19 05:00:11 impala master[9697]: [...] process started
> > Nov 19 05:00:11 impala ctl_cyrusdb[9698]: [...] recovering cyrus databases
> > Nov 19 05:05:10 impala ctl_mboxlist[10854]: [...] skiplist: recovered
> > 	/var/imap/mailboxes.db (61786 records, 4909724 bytes) in 9 seconds
> > Nov 19 08:38:54 impala ctl_cyrusdb[9698]: [...] done recovering cyrus databases
> > Nov 19 08:38:54 impala master[9697]: [...] ready for work
> > Nov 19 08:38:54 impala ctl_cyrusdb[22419]: [...] checkpointing cyrus databases
>
> The lock_flock patch has serious performance implications (namely, if you
> don't get a lock on the first try, you have to wait an entire second to
> try again), and given that this happened just after you changed the
> locking mechanism, it seems suspicious.
>
> However, I can't think what would be causing the recovery process to lose
> at getting the locks it needs, so (nothing else should be running at
> that time)....

My logs don't go back any further (without hassling the sys admins) but
the database recovery took ~90 mins two days before the lock_flock patch
was applied and then ~3h 38m immediately after it was applied:

Nov 17 05:00:10 impala master[19332]: [ID 965400 local6.notice] process started
Nov 17 06:30:51 impala master[19332]: [ID 139525 local6.notice] ready for work
Nov 19 05:00:11 impala master[9697]: [ID 965400 local6.notice] process started
Nov 19 08:38:54 impala master[9697]: [ID 139525 local6.notice] ready for work

I then reduced the -E paramater to ctl_deliver from 3 to 1 and at the next
restart a few days later, still running with the lock_flock patch, it
took 20 seconds and at the next restart 22 seconds:

Nov 23 04:00:10 impala master[1871]: [ID 965400 local6.notice] process started
Nov 23 04:00:30 impala master[1871]: [ID 139525 local6.notice] ready for work
Nov 28 04:00:10 impala master[22770]: [ID 965400 local6.notice] process started
Nov 28 04:00:32 impala master[22770]: [ID 139525 local6.notice] ready for work

The reason why I am restarting the server is to deal with the odd
"IOERROR: reading message: unexpected end of file" errors, which I am
still getting, which first led me to the lock_flock patch (archive message
18705).

Today for the first time I am seeing the constantly repeated message
"DBERROR: mydelete: error deleting <3FBCB29C.19519.D99C08 at localhost>:
DB_NOTFOUND: No matching key/data pair found".  I haven't seen much on the
list about this so I will be stopping the server in the early hours and
deleting the duplicate.db before restarting.  (tom at bryntez.com said he ran
reconstruct -f after deleting the database but reconstruct doesn't seem to
be relevant to the duplicate delivery database so I won't bother.)

The duplicate delivery database (Berkeley DB3) seems to be at the root of
the few problems I have with Cyrus and yet I get the impression from the
discussions on this list that disabling duplicatesuppression won't
actually make things any better because the database is still maintained.

Thank you in anticipation of any advice or comments.

Richard
--
Richard Gilbert
Corporate Information and Computing Services
University of Sheffield, Sheffield, S10 2TN, UK
Phone: +44 114 222 3028   Fax: +44 114 222 3040




More information about the Info-cyrus mailing list