Sergio Devojno Bruder
bruder at haxent.com.br
Mon May 8 14:33:11 EDT 2006
Ken Murchison wrote:
> Bernhard Reiter wrote:
>> On Mon, Apr 10, 2006 at 05:17:56PM +0200, Bernhard Reiter wrote:
>>> On Sun, Apr 09, 2006 at 10:54:24PM -0400, Ken Murchison wrote:
>>>> Bernhard Reiter wrote:
>>>>> On Fri, Apr 07, 2006 at 05:24:53PM -0400, Ken Murchison wrote:
>>>>>> Martin Konold wrote:
>>>>>>> Can you explain how a dying process can create such a broken
>>>>>>> skiplist db?
>>>> I've already asked the person that wrote the code to take a look and
>>>> share his thoughts.
>>> Thanks, we are looking forward to it.
>> were there any ideas from the person you have asked?
> No. I assume he's busy at Google.
We have already seen this 'dying process corrupting skiplist db' problem
a lot. You can reproduce it easily on a Linux box with a really big db
(>5-7M mailboxes) and little memory (<= 1GB): some mmap operations will
fail with ENOMEM, and the process will give up, leaving a broken db.
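To make the failure mode concrete, here is a minimal sketch of checking the mmap result and handing the error back to the caller instead of giving up in the middle of an operation. This is not the Cyrus code: map_db_readonly is a hypothetical helper, and the read-only, whole-file mapping is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the Cyrus sources): open `path` and map
 * the whole file read-only. Returns 0 on success and fills in *base_out
 * and *len_out. On failure it returns the errno value (e.g. ENOMEM from
 * mmap) and leaves the outputs as NULL/0, so the caller can abandon the
 * operation cleanly before touching the db. */
int map_db_readonly(const char *path, const char **base_out, size_t *len_out)
{
    struct stat sb;
    void *p;
    int fd, err;

    assert(path != NULL && base_out != NULL && len_out != NULL);
    *base_out = NULL;
    *len_out = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    if (fstat(fd, &sb) < 0) {
        err = errno;
        close(fd);
        return err;
    }
    if (sb.st_size == 0) {          /* mmap rejects a zero-length mapping */
        close(fd);
        return EINVAL;
    }

    p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        err = errno;                /* save errno before any other call */
        close(fd);
        fprintf(stderr, "mmap of %s failed: %s\n", path, strerror(err));
        return err;
    }

    close(fd);                      /* the mapping stays valid after close */
    *base_out = p;
    *len_out = (size_t)sb.st_size;
    return 0;
}
```

A caller would test the return value against 0 before starting any write, and release the mapping with munmap when done. Whether this alone would be enough to keep the skiplist file consistent is a separate question; the sketch only shows where the ENOMEM case surfaces.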
NOTE: mmap will fail with ENOMEM in linux with free memory and lots of
We have also seen another type of problem with SMP (2 x Xeon with HT, 4
'processors' as far as Linux is concerned) on Cyrus 2.2.10, which
resulted in corruption too (and in all kinds of synchronization problems
between frontends, mupdate and backends). We "solved" it by running a UP
kernel on that same box; mupdate doesn't need all that CPU power. In
other words, there is a race there.