What happened to my db/ ?

Simon Matter simon.matter at ch.sauter-bc.com
Wed Jan 14 12:11:02 EST 2004


>>>>> Hello All,
>>>>>   I recently moved my cyrus-imapd install from one machine to
>>>>> another.
>>>>> The move went *quite* well. I had both servers stopped and rsynced the
>>>>> live server's data (/var/imap, /var/spool/imap) over to the new
>>>>> server.
>>>>>
>>>>> Everything was fine until I started playing with SSL; at that point I
>>>>> started having problems getting cyrus running after a restart.
>>>>>
>>>>> Here is what I was seeing in my logs:
>>>>> """
>>>>> Jan  8 17:27:11 [tls_prune] DBERROR db4: operation not permitted during recovery.
>>>>> Jan  8 17:27:11 [tls_prune] DBERROR: opening /var/imap/tls_sessions.db: Invalid argument
>>>>> Jan  8 17:27:11 [tls_prune] DBERROR: opening /var/imap/tls_sessions.db: cyrusdb error
>>>>> """
>>>>>
>>>>> Somewhere I found a post from someone having the same problem. They
>>>>> said that they could get cyrus working again by removing all the
>>>>> files in /var/imap/db. So I did that. Now I can start cyrus and read
>>>>> my mail *PHEW*. However, I now have a new raft of errors showing up in
>>>>> syslog. Every time I start cyrus I get this:
>>>>> """
>>>>> Jan  8 18:01:42 [master] setrlimit: Unable to set file descriptors limit to -1: Operation not permitted
>>>>> Jan  8 18:01:42 [master] retrying with 1024 (current max)
>>>>> Jan  8 18:01:42 [master] process started
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] recovering cyrus databases
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR db4: DB_ENV->log_flush: LSN past current end-of-log
>>>>>                 - Last output repeated twice -
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR db4: Recovery function for LSN 1 182 failed
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR: dbenv->open '/var/imap/db' failed: Invalid argument
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR: init /var/imap/db: cyrusdb error
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR db4: environment not yet opened
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR: opening /var/imap/mailboxes.db: Invalid argument
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] DBERROR: opening /var/imap/mailboxes.db: cyrusdb error
>>>>> Jan  8 18:01:42 [master] process 4105 exited, status 75_
>>>>> Jan  8 18:01:42 [master] ready for work
>>>>> Jan  8 18:01:42 [tls_prune] tls_prune: purged 0 out of 0 entries
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] checkpointing cyrus databases
>>>>> Jan  8 18:01:42 [ctl_deliver] duplicate_prune: pruning back 3 days
>>>>> Jan  8 18:01:42 [ctl_deliver] duplicate_prune: purged 2 out of 742 entries
>>>>> Jan  8 18:01:42 [ctl_cyrusdb] done checkpointing cyrus databases
>>>>> """
>>>>>
>>>>> I don't *think* it's a permissions problem...
>>>>> """
>>>>> usr # ls -la /var/imap/db/.
>>>>> total 12952
>>>>> drwxr-xr-x    2 cyrus    mail         4096 Jan  8 21:15 .
>>>>> drwxr-xr-x   14 cyrus    mail         4096 Jan  9 16:45 ..
>>>>> -rw-r--r--    1 cyrus    mail            0 Sep 23 13:31 .keep
>>>>> -rw-------    1 cyrus    mail         8192 Jan  8 21:15 __db.001
>>>>> -rw-------    1 cyrus    mail       270336 Jan  8 21:15 __db.002
>>>>> -rw-------    1 cyrus    mail        98304 Jan  8 21:15 __db.003
>>>>> -rw-------    1 cyrus    mail     18063360 Jan  8 21:15 __db.004
>>>>> -rw-------    1 cyrus    mail        32768 Jan  8 21:15 __db.005
>>>>> -rw-------    1 cyrus    mail       286763 Jan  9 17:10 log.0000000001
>>>>> """
>>>>>
>>>>> I'm using EXT3 so I don't think it's anything to do with the
>>>>> "chattr +S" stuff.
>>>>>
>>>>> I tried recompiling cyrus.... it didn't help.
>>>>>
>>>>> The idea of cyrus dying fills me with dread. Someone please help me
>>>>> out here.... what is going on and how do I stop it? If someone can
>>>>> give me a bit of a shove in the right direction it would be MOST
>>>>> helpful..... links... general information.... even mild abuse, all
>>>>> gratefully accepted.
>>>>
>>>> Please search the list archives.
>>> I have... I used parts of the error message as keywords and didn't come
>>> up with much that appeared to be relevant. If you can suggest better
>>> keywords I will happily try again.
>>>
>>>> IIRC there have been two problems which
>>>> could create messages like yours. One problem is with NPTL on newer
>>>> RedHat Linux and AMD and some other CPUs.
>>> I found some posts on this and did some reading. At first I got a bit
>>> excited because I have just changed from a PIII to an Athlon. Now I
>>> don't think that this is the problem. I compiled from source (I use
>>> Gentoo) so '--enable-posixmutexes' shouldn't be set. I also just tried
>>> recompiling db4 with '--disable-posixmutexes'. AFAIK I didn't have to
>>> recompile cyrus afterwards (please correct me if I'm wrong here). After
>>> a restart of cyrus I'm still getting the same messages.
>>>
>>>> The other problem is IIRC something
>>>> with the latest greatest DB4 release. Search for 'DB_ENV' and
>>>> 'dbenv->open', I'm quite sure you'll find out what's wrong.
>>> I can't find the second issue you refer to. When I search on those two
>>> terms alone I get a snow of documentation, and when I add keywords like
>>> 'bug' or 'cyrus' I only find the NPTL problems.
>>>
>>> It occurred to me after I posted this that I didn't give any versions
>>> of what I'm using....
>>> The system is running Gentoo and I'm using the packages marked as
>>> stable in the portage tree.
>>> DB4 = 4.0.14
>>> Cyrus = 2.1.15
>>> CPU = AMD T-bird 900
>>> Kernel = gs-sources 2.4.23_pre8-gss-r2
>>> If I've missed anything please let me know.
>>>
>>> The only other idea I've got from my hunting around is that it might be
>>> worth trying to rebuild/fix the db/ manually. That gave me nearly the
>>> same error:
>>> db_recover: DB_ENV->log_flush: LSN past current end-of-log
>>> db_recover: DB_ENV->log_flush: LSN past current end-of-log
>>> db_recover: Recovery function for LSN 1 182 failed
>>> db_recover: DB_ENV->open: Invalid argument
>>>
>>> Many thanks for the help but I'm afraid I need more :(
>>
>> Well, since you told us you moved your db files from another server, I
>> suspect you have a database mismatch. Cyrus-imapd can be compiled with
>> many different database backends, and even the on-disk formats of db3
>> and db4 are not compatible. I'm quite sure that your database files (or
>> one of them) are not in the format your Cyrus-imapd expects.
>>
> Not a bad idea.... I have both db3 and db4 installed. To test, I figured
> that I would try to run the db3 version of stat. I stopped cyrus, went
> into the db dir, ran the recover, and this was the result:
>
> db_stat: Program version 3.2.9 doesn't match environment version 4.0.14
> db_stat: DBENV->open: Invalid argument
>
> Then I tried the db4 version and got this:
>
> 4.0.14  Environment version.
> 120897  Magic number.
> 0       Panic value.
> 0       References.
> 82      Locks granted without waiting.
> 0       Locks granted after waiting.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Mpool Region: 2.
> 264KB   Size (270336 bytes).
> -1      Segment ID.
> 8840    Locks granted without waiting.
> 0       Locks granted after waiting.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Log Region: 3.
> 96KB    Size (98304 bytes).
> -1      Segment ID.
> 1177    Locks granted without waiting.
> 1       Locks granted after waiting.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Lock Region: 4.
> 17MB 232KB      Size (18063360 bytes).
> -1      Segment ID.
> 7342    Locks granted without waiting.
> 0       Locks granted after waiting.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-Txn Region: 5.
> 32KB    Size (32768 bytes).
> -1      Segment ID.
> 480     Locks granted without waiting.
> 0       Locks granted after waiting.
>
> I also found this post:
> http://www.netsys.com/openldap-software/2003/07/msg00477.html
> That warns against exactly what I did.... copying the db files from one
> place to another. It's not the only post I've found warning about the
> lack of portability of bdbs. I'm wondering how I can get myself out of
> all this, though I'm at a loss to explain the details. I've never worked
> that much with the bdb stuff before. It all does *seem* to point to the
> bdb environment.... and I have no idea what to do about it.
>
> Can I totally regenerate the database: somehow remove all the files and
> tell cyrus to re-initialise the db? What would happen to the mail and
> its attributes (seen/unseen/etc)?

IIRC you need at least a cleartext dump of the mailbox list db. Everything
else can be reconstructed with the 'reconstruct' command.
I suggest using skiplist wherever it is recommended in the 2.2 beta. My
rpms do that, and they even convert the databases on the fly at startup as
needed.
I'm really no expert with bdb, but I try to stay away from it. My rpms
do not, however, convert db3 -> db4 files, so if someone migrates from an
older RedHat release to a newer one, he can rsync the whole spool but has
to export db3 -> cleartext and then import cleartext -> db4 on the target
host.
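
Before exporting or converting anything, it may help to check which format
each file in /var/imap really is. The following is only a rough sketch, not
a Cyrus or Berkeley DB tool: the function names sniff_format and sniff_file
are made up for illustration, and it assumes the usual Berkeley DB layout
(magic number at byte offset 12, format version right after it, either byte
order) and the usual Cyrus skiplist file signature. The version it prints is
the access-method format version, not the library release (3.2.9, 4.0.14),
and it says nothing about the state of the environment in /var/imap/db.

```python
import struct

# Magic numbers Berkeley DB writes at byte offset 12 of a database file.
BDB_MAGICS = {
    0x053162: "berkeley btree",
    0x061561: "berkeley hash",
    0x042253: "berkeley queue",
}

# Leading bytes of a Cyrus skiplist file.
SKIPLIST_MAGIC = b"\xa1\x02\x8b\x0dskiplist file\x00\x00\x00"


def sniff_format(header: bytes) -> str:
    """Guess the database format from the first bytes of a file."""
    if header.startswith(SKIPLIST_MAGIC):
        return "skiplist"
    if len(header) >= 20:
        # The file may have been written on a host with either byte order.
        for order in ("<", ">"):
            magic, version = struct.unpack_from(order + "II", header, 12)
            if magic in BDB_MAGICS:
                return f"{BDB_MAGICS[magic]} (on-disk version {version})"
    return "unknown (possibly flat text)"


def sniff_file(path: str) -> str:
    """Read just enough of a file to classify it."""
    with open(path, "rb") as f:
        return sniff_format(f.read(32))
```

For example, sniff_file("/var/imap/mailboxes.db") would report whether that
file looks like a Berkeley btree, a skiplist, or something else, which tells
you what has to be exported to cleartext first.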

You're not using RedHat and rpms, so this won't help you much directly, but
it should give you an idea of what happened.

Maybe someone on the list can give some more help as to what you can do
now because your system seems a bit messed up, right?

Simon

>
> Thanks again for all the help so far.....
>
>   Nick
>





