cyrus-murder problems with database corruption in the frontend/master

João Assad jfassad at parperfeito.com.br
Thu Apr 7 23:47:08 EDT 2005


João Assad wrote:

> Derrick J Brashear wrote:
>
>> On Thu, 7 Apr 2005, João Assad wrote:
>>
>>> OK, I got a backtrace (I think). I don't really know how to use gdb.
>>
>>
>>
>> Did you compile without giving gcc the -g option? Probably. Having
>> unstripped binaries with useful symbols would probably make for a
>> more useful backtrace. (At least I hope so.)
>
>
> I recompiled 10 more times with -g and -ggdb, tried building from the
> Cyrus source, and tried the Fedora package to build the debuginfo
> package. I still get the same output from gdb, and the binaries
> aren't stripped.
>
I managed to get a backtrace using debug_command (thanks for this nifty
feature, Henrique de Moraes).

I removed the first 18988 frames because they only repeated the
assertion error:

#18988 0x0804cdcb in fatal (
    s=0x2c60a080 "Internal error: assertion failed: mupdate.c: 585: 0",
    code=75) at mupdate.c:585
#18989 0x0807dd39 in assertionfailed (file=0x807e1d9 "mupdate.c", line=585,
    expr=0x807e401 "0") at assert.c:61
#18990 0x0804cdcb in fatal (
    s=0x2c60a4d0 "failed to mmap /var/lib/imap/mailboxes.db file", code=75)
    at mupdate.c:585
#18991 0x080718ac in map_refresh (fd=11, onceonly=0, base=0x8ece710,
    len=0x8ece714, newlen=8978432,
    name=0x8ece740 "/var/lib/imap/mailboxes.db", mboxname=0x0)
    at map_shared.c:105
#18992 0x080741d5 in update_lock (db=0x8ece708, txn=0x2c60a6f0)
    at cyrusdb_skiplist.c:567
#18993 0x08076211 in mycommit (db=0x8ece708, tid=0x2c60a6f0)
    at cyrusdb_skiplist.c:1340
#18994 0x08075cd1 in mystore (db=0x8ece708,
    key=0x8ee1c10 "user.1038786._SAVE", keylen=18,
    data=0x8ee4128 "1 cyrus-be3.gazzag.com!default 1038786\tlrswipcda\t",
    datalen=49, tid=0x0, overwrite=1) at cyrusdb_skiplist.c:1220
#18995 0x08075d35 in store (db=0x8ece708,
    key=0x8ee1c10 "user.1038786._SAVE", keylen=18,
    data=0x8ee4128 "1 cyrus-be3.gazzag.com!default 1038786\tlrswipcda\t",
    datalen=49, tid=0x0) at cyrusdb_skiplist.c:1239
#18996 0x080559d2 in mboxlist_insertremote (
    name=0x8ee1c10 "user.1038786._SAVE", mbtype=0,
    host=0x8ee2728 "cyrus-be3.gazzag.com!default",
    acl=0x8ee3230 "1038786\tlrswipcda\t", tid=0x0) at mboxlist.c:791
#18997 0x0804e811 in database_log (mb=0x8ee3220, mytid=0x0)
    at mupdate.c:1299
#18998 0x0804ef27 in cmd_set (C=0x9400b88, tag=0x8ee0f20 "X191648",
    mailbox=0x8ee1000 "user.1038786._SAVE",
    server=0x8ee1070 "cyrus-be3.gazzag.com!default",
    acl=0x8ee1598 "1038786\tlrswipcda\t", t=SET_ACTIVE) at mupdate.c:1526
#18999 0x0804d2cc in docmd (c=0x9400b88) at mupdate.c:684
#19000 0x0804e64a in thread_main (rock=0x0) at mupdate.c:1227
#19001 0x0069b98c in start_thread () from /lib/tls/libpthread.so.0
#19002 0x005897da in clone () from /lib/tls/libc.so.6
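
Trimming thousands of repeated frames by hand is tedious, so here is a minimal sketch of how the repetition could be collapsed automatically. It is only an illustration: the helper names (frame_functions, collapse_cycles) and the shortened SAMPLE trace are made up for this example, and the regular expression assumes the usual "#N 0xADDR in func (" layout that gdb prints.

```python
import re

# Matches the first line of a gdb frame, e.g. "#18988 0x0804cdcb in fatal ("
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")

# Shortened, illustrative stand-in for a real trace (arguments abbreviated).
SAMPLE = """\
#0  0x0804cdcb in fatal (s=0x0, code=75) at mupdate.c:585
#1  0x0807dd39 in assertionfailed (file=0x0, line=585, expr=0x0) at assert.c:61
#2  0x0804cdcb in fatal (s=0x0, code=75) at mupdate.c:585
#3  0x0807dd39 in assertionfailed (file=0x0, line=585, expr=0x0) at assert.c:61
#4  0x0804cdcb in fatal (s=0x0, code=75) at mupdate.c:585
#5  0x080718ac in map_refresh (fd=11, onceonly=0) at map_shared.c:105
"""


def frame_functions(backtrace):
    """Return the function name of every '#N ...' frame line, in order."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


def collapse_cycles(names, max_period=8):
    """Collapse back-to-back repeats of a short cycle into (cycle, count) pairs."""
    out = []
    i = 0
    while i < len(names):
        best_period, best_count = 1, 1
        for period in range(1, max_period + 1):
            cycle = names[i:i + period]
            if len(cycle) < period:
                break  # not enough frames left for this period
            count = 1
            while names[i + count * period:i + (count + 1) * period] == cycle:
                count += 1
            # Prefer the repetition that covers the most frames.
            if count > 1 and count * period > best_count * best_period:
                best_period, best_count = period, count
        out.append((tuple(names[i:i + best_period]), best_count))
        i += best_period * best_count
    return out


if __name__ == "__main__":
    for cycle, count in collapse_cycles(frame_functions(SAMPLE)):
        suffix = " (x%d)" % count if count > 1 else ""
        print(" -> ".join(cycle) + suffix)
```

Fed the full gdb output saved to a file, something like this would report the fatal/assertionfailed pair once with a repeat count, followed by the frames that differ.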

OK, I can reproduce this error every time by simply using the following
values and then syncing any of my backends to the master server:

mupdate_connections_max: 1024
mupdate_workers_start: 100
mupdate_workers_minspare: 100
mupdate_workers_maxspare: 500
mupdate_workers_max: 1024

On production, I use the following values:

mupdate_connections_max: 1024
mupdate_workers_start: 40
mupdate_workers_minspare: 16
mupdate_workers_maxspare: 80
mupdate_workers_max: 400

With these values it takes longer for the mmap failure to happen, so I'll
send another backtrace when I get the error on the production server.
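
One possible explanation that might be worth ruling out is address-space exhaustion, although nothing in the backtrace confirms it. The rough arithmetic below is a sketch under stated assumptions: a 32-bit process (the addresses in the backtrace are 32-bit) with about 3 GiB of usable address space, an 8 MiB stack reserved per worker thread, and a single mapping of mailboxes.db at the size shown as newlen in the map_refresh frame. The stack size and the 3 GiB limit are guesses, not values taken from this system, and reserved_mib is a name invented for the example.

```python
MIB = 1024 * 1024

# Assumption: roughly 3 GiB of user address space for a 32-bit Linux process.
USER_SPACE_MIB = 3 * 1024

# From the backtrace: map_refresh was asked for newlen=8978432 bytes.
DB_MAP_BYTES = 8978432


def reserved_mib(workers, stack_mib=8, db_map_bytes=DB_MAP_BYTES):
    """Estimate virtual address space (MiB) for worker stacks plus one db mapping.

    stack_mib=8 is an assumed per-thread stack reservation, not a measured value.
    """
    return workers * stack_mib + db_map_bytes / MIB


if __name__ == "__main__":
    settings = [
        ("test mupdate_workers_start", 100),
        ("test mupdate_workers_max", 1024),
        ("production mupdate_workers_start", 40),
        ("production mupdate_workers_max", 400),
    ]
    for label, workers in settings:
        need = reserved_mib(workers)
        verdict = "exceeds" if need > USER_SPACE_MIB else "fits within"
        print("%-34s %5d workers: %8.1f MiB, %s %d MiB"
              % (label, workers, need, verdict, USER_SPACE_MIB))
```

If these assumptions are anywhere near right, both worker limits could outgrow the address space, with the lower production limit simply getting there more slowly. Checking the actual stack limit (ulimit -s) and the process's mappings in /proc/PID/maps would show whether this estimate applies.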

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html



