seen db

Bron Gondwana brong at fastmail.fm
Tue Jun 10 09:42:08 EDT 2008


On Tue, 10 Jun 2008 15:29:01 +0200, "Rudy Gevaert" <Rudy.Gevaert at UGent.be> said:
> Bron Gondwana wrote:
> 
> > Are you running a 64 bit kernel?
> 
> Yes, but the system is 32bit (I run 64bit kernel  + 32 emulation support)

Interesting, so do we (on etch as well)

> > (just wondering - we have hit pretty much the same issue - and were
> > wondering about dodgy kernel issues being a problem - it's only one
> > machine that seems to have corrupted seen files, only on replicas)
> 
> 
> > We've been running 2.3.12 for about a week, and it's only last night
> > that we had anything funny show up at all.
> > 
> > Interestingly, it's probably the first time cyr_expire ran on 2.3.12
> > just before that - and also the first time our check-replication
> > script was running, which loads a lot of seen files on BOTH ends.
> 
> Here cyr_expire has been running on 2.3.12 for a couple of weeks.  But 
> here the first time too with the 64bit kernel.

There you go.  We've had the 64bit kernel approximately forever, but only
just upgraded from 2.6.20 series to 2.6.25.

> I can try with a 32bit kernel tomorrow.
> 
> In attachment a strace to show where it segfaults

Almost certainly boring, since it's file corruption.  The file itself would
be significantly more interesting.  My guess - you'll be finding little blocks
of (small n)*4 bytes which happen to be zeroed.  It's when they intersect with
the pointers table that things get interesting.
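
If you want to look for those blocks yourself, something like the following
might do as a first pass.  It is only a rough sketch and knows nothing about
the skiplist on-disk format: it just reports short runs of 4-byte-aligned
words that are entirely zero.  The function name, the parameters and the
default limits are made up for illustration, and a healthy file will contain
zero words too (padding, unused pointers), so every hit is a candidate to
inspect by hand, not proof of corruption.

```python
def find_zero_runs(data, word=4, min_words=1, max_words=8):
    """Return (offset, length_in_bytes) for each maximal run of aligned
    all-zero words whose length is between min_words and max_words.

    Assumption: 'small n' is taken to mean at most max_words words;
    that limit is an arbitrary choice, not anything from the source.
    """
    runs = []
    zero = bytes(word)
    nwords = len(data) // word      # trailing partial word is ignored
    start = None                    # index of first word in current run
    for i in range(nwords + 1):     # one extra step to flush a final run
        is_zero = i < nwords and data[i * word:(i + 1) * word] == zero
        if is_zero:
            if start is None:
                start = i
        elif start is not None:
            count = i - start
            if min_words <= count <= max_words:
                runs.append((start * word, count * word))
            start = None
    return runs
```

Run it on a copy of the suspect seen file (read it with open(path, "rb")),
never on the live one, and compare the reported offsets against a hex dump.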

Oh - can you tell me: did the file get checkpointed not too long before it
got corrupted?

I've got a small set of theories, but I'm reading the skiplist source code
(again!) to see if they make sense...

Bron.
-- 
  Bron Gondwana
  brong at fastmail.fm


