cyrus-murder problems with database corruption in the frontend/master
João Assad
jfassad at parperfeito.com.br
Thu Apr 7 23:47:08 EDT 2005
João Assad wrote:
> Derrick J Brashear wrote:
>
>> On Thu, 7 Apr 2005, João Assad wrote:
>>
>>> Ok I got a backtrace ( I think ) . I dont really know how to use gdb
>>
>> did you compile without giving gcc the -g option? Probably. Having
>> unstripped binaries with useful symbols would probably make for a
>> more useful backtrace. (at least i hope so)
>
>
> recompiled 10 more times with -g, -ggdb .. tried using cyrus source..
> tried the fedora package to build the debuginfo package. still get the
> same output from gdb
> and the binaries arent striped.
>
I managed to get a backtrace using debug_command (thanks for this nifty
feature, Henrique de Moraes).
I removed the first 18988 frames because they only repeated the
assertion error.
#18988 0x0804cdcb in fatal (
s=0x2c60a080 "Internal error: assertion failed: mupdate.c: 585: 0",
code=75) at mupdate.c:585
#18989 0x0807dd39 in assertionfailed (file=0x807e1d9 "mupdate.c", line=585,
expr=0x807e401 "0") at assert.c:61
#18990 0x0804cdcb in fatal (
s=0x2c60a4d0 "failed to mmap /var/lib/imap/mailboxes.db file", code=75)
at mupdate.c:585
#18991 0x080718ac in map_refresh (fd=11, onceonly=0, base=0x8ece710,
len=0x8ece714, newlen=8978432,
name=0x8ece740 "/var/lib/imap/mailboxes.db", mboxname=0x0)
at map_shared.c:105
#18992 0x080741d5 in update_lock (db=0x8ece708, txn=0x2c60a6f0)
at cyrusdb_skiplist.c:567
#18993 0x08076211 in mycommit (db=0x8ece708, tid=0x2c60a6f0)
at cyrusdb_skiplist.c:1340
#18994 0x08075cd1 in mystore (db=0x8ece708,
key=0x8ee1c10 "user.1038786._SAVE", keylen=18,
data=0x8ee4128 "1 cyrus-be3.gazzag.com!default 1038786\tlrswipcda\t",
datalen=49, tid=0x0, overwrite=1) at cyrusdb_skiplist.c:1220
#18995 0x08075d35 in store (db=0x8ece708,
key=0x8ee1c10 "user.1038786._SAVE", keylen=18,
data=0x8ee4128 "1 cyrus-be3.gazzag.com!default 1038786\tlrswipcda\t",
datalen=49, tid=0x0) at cyrusdb_skiplist.c:1239
#18996 0x080559d2 in mboxlist_insertremote (
name=0x8ee1c10 "user.1038786._SAVE", mbtype=0,
host=0x8ee2728 "cyrus-be3.gazzag.com!default",
acl=0x8ee3230 "1038786\tlrswipcda\t", tid=0x0) at mboxlist.c:791
#18997 0x0804e811 in database_log (mb=0x8ee3220, mytid=0x0)
at mupdate.c:1299
#18998 0x0804ef27 in cmd_set (C=0x9400b88, tag=0x8ee0f20 "X191648",
mailbox=0x8ee1000 "user.1038786._SAVE",
server=0x8ee1070 "cyrus-be3.gazzag.com!default",
acl=0x8ee1598 "1038786\tlrswipcda\t", t=SET_ACTIVE) at mupdate.c:1526
#18999 0x0804d2cc in docmd (c=0x9400b88) at mupdate.c:684
#19000 0x0804e64a in thread_main (rock=0x0) at mupdate.c:1227
#19001 0x0069b98c in start_thread () from /lib/tls/libpthread.so.0
#19002 0x005897da in clone () from /lib/tls/libc.so.6
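For anyone facing a similarly long trace, the trimming described above (dropping the leading frames that only repeat the assertion error) can be automated. What follows is a minimal Python sketch, not something I used for this post: the function names `split_frames`, `signature` and `trim_leading_repeats` are hypothetical, and it assumes gdb's usual `#N ... in func (...) at file:line` frame layout with wrapped argument lines.

```python
import re

# A frame starts with "#<number>"; anything else is a wrapped continuation line.
FRAME_START = re.compile(r"^#(\d+)\s+")


def split_frames(text):
    """Group backtrace lines into one string per frame."""
    frames = []
    for line in text.splitlines():
        if FRAME_START.match(line):
            frames.append(line.strip())
        elif frames and line.strip():
            frames[-1] += " " + line.strip()
    return frames


def signature(frame):
    """Return (function, 'file:line' or None), ignoring addresses and arguments."""
    func = re.search(r"(?:0x[0-9a-f]+ in )?(\w+) \(", frame)
    loc = re.search(r" at (\S+:\d+)", frame)
    return (func.group(1) if func else frame, loc.group(1) if loc else None)


def trim_leading_repeats(frames, max_period=8):
    """Drop leading frames for as long as the next `period` frames repeat them.

    The last copy of the repeating group is kept, so the trimmed trace still
    shows one instance of the cycle before the frames that differ.
    """
    sigs = [signature(f) for f in frames]
    for period in range(1, max_period + 1):
        i = 0
        while (i + 2 * period <= len(sigs)
               and sigs[i:i + period] == sigs[i + period:i + 2 * period]):
            i += period
        if i:
            return frames[i:]
    return frames
```

Usage would look like `print("\n".join(trim_leading_repeats(split_frames(open("bt.txt").read()))))`, where `bt.txt` is a hypothetical file holding the saved gdb output. Frames are compared by function name and source location only, since string addresses such as `s=0x2c60a080` differ from one repetition to the next.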
OK, I can reproduce this error every time by simply using the following
values and then syncing any of my backends to the master server:
mupdate_connections_max: 1024
mupdate_workers_start: 100
mupdate_workers_minspare: 100
mupdate_workers_maxspare: 500
mupdate_workers_max: 1024
On production, I use the following values:
mupdate_connections_max: 1024
mupdate_workers_start: 40
mupdate_workers_minspare: 16
mupdate_workers_maxspare: 80
mupdate_workers_max: 400
With these values it takes longer for the mmap failure to happen, so I'll
send another backtrace when I get the error on the production server.