Updating /seen from concurrent sessions

Andrew McNamara andrewm at object-craft.com.au
Thu Nov 14 20:52:50 EST 2002


>Try using skiplist for the seen.db
>It doesn't really solve the problem but it masks it well enough.
>
>From my understanding, changing to skiplist really shouldn't change
>the visible behavior at all. But I've been wrong before.

I'll try to test it here and let you know. My reading of the code suggests
it shouldn't change the specific problem I'm seeing.

What's the general feeling on the skiplist implementation used in
conjunction with Sun and NetApp's NFS (we're locked in to using this
combination for various reasons)? Would you be more or less likely to
trust it over db3?

Another question - it looks to me like I have to recompile to switch
database types - is this true? The code looks like it would be flexible
enough to allow a run-time config option to choose the method with very
little modification?

>It would be possible to flush the seen state more often; it's just a
>question of how often and when should other imapds look for it. 

If the imapd already can cope with asynchronous events, I would flush the
state after a second or two of inactivity from the client. Failing that,
I would probably flush the state before replying to the client (yes,
this would hurt performance, although probably not much, particularly
if we skip the fsync()).
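
As a rough illustration of the "flush after a second or two of inactivity"
idea, here is a minimal Python sketch. It is not Cyrus code: the IdleFlusher
class, its method names and the flush callback are all hypothetical, and it
assumes the server has an event loop that can call poll() periodically.

```python
import time


class IdleFlusher:
    """Call `flush` once the client has been quiet for `idle_seconds`.

    Hypothetical helper: `flush` stands in for whatever writes the seen
    state to disk.
    """

    def __init__(self, flush, idle_seconds=2.0, clock=time.monotonic):
        self._flush = flush
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._dirty = False
        self._last_activity = clock()

    def note_activity(self):
        # Any client command resets the idle timer.
        self._last_activity = self._clock()

    def mark_dirty(self):
        # The seen state changed in memory and is not yet on disk.
        self._dirty = True
        self.note_activity()

    def seconds_until_flush(self):
        # Suitable as a select()/poll() timeout; None means nothing pending.
        if not self._dirty:
            return None
        idle = self._clock() - self._last_activity
        return max(0.0, self._idle_seconds - idle)

    def poll(self):
        # Flush if there is unsaved state and the client has been idle
        # long enough. Returns True if a flush happened.
        if self._dirty and self.seconds_until_flush() == 0.0:
            self._flush()
            self._dirty = False
            return True
        return False
```

The event loop would call mark_dirty() whenever a flag changes,
note_activity() on every client command, and poll() whenever its wait times
out, using seconds_until_flush() as the timeout.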

But this just fixes the OE problem - Cyrus would still have a problem
(as far as I can see): all the other copies accessing that mailbox
will still have their old seen files open (maybe using skiplist fixes
this). The flat-file seen implementation needs to check to see if the
file has been renamed under it (and do what?).
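
One common way to detect that a file has been renamed under an open
descriptor is to compare the device and inode numbers of the descriptor with
those of the path. The Python sketch below shows the idea; the function name
is made up for illustration, and this is not how the Cyrus seen code is
written. On NFS, attribute caching may delay the point at which a client
notices the change.

```python
import os


def file_was_replaced(fd, path):
    """Return True if `path` no longer names the file open on `fd`."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # The path is gone entirely; our descriptor is certainly stale.
        return True
    opened = os.fstat(fd)
    # A file renamed into place has a different inode from the one we
    # still hold open, even though the path is the same.
    return (opened.st_dev, opened.st_ino) != (on_disk.st_dev, on_disk.st_ino)
```

A process that sees True here would presumably close its descriptor, reopen
the path and re-read the state before trusting anything it had cached.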

To be honest, the flat file seen implementation is way more complicated
than I would have thought was worthwhile. My preference would be to
not hold the file open, and simply re-write the whole file each time we
updated it, renaming the replacement into place (to make the operation
atomic - this is also the only synchronous operation). My experience has
been that unix is quite happy doing naive things like this while the
file remains small (say less than 10k).
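
A minimal Python sketch of the rewrite-and-rename approach described above.
The function name, the temporary file prefix and the optional sync flag are
assumptions made for illustration; the point is only that readers see either
the old file or the new one, never a partial write.

```python
import os
import tempfile


def rewrite_atomically(path, data, sync=False):
    """Replace the contents of `path` with `data` (bytes) via a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename is not atomic (or fails outright).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if sync:
                # Optional: force the data to disk before the rename.
                tmp.flush()
                os.fsync(tmp.fileno())
        # The rename is the step that makes the new contents visible.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that mkstemp creates the file with owner-only permissions, so a real
implementation may need to set the mode before renaming.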

I implemented a Postfix map that works this way - for lookups, it simply
does a linear read/search of the file. For update, it writes a new file,
and moves it into place. Generally this performed much better than
more complex schemes such as Sleepycat DB - particularly when you
consider memory footprint (this was on a machine with about 100k users,
handling tens of messages per second).
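
The general shape of such a map can be sketched in Python as follows. This is
not the Postfix map mentioned above: the one "key value" pair per line
format and the lookup/update names are assumptions, keys are assumed to
contain no spaces or newlines, and there is no locking, so two concurrent
updaters could lose one of the changes.

```python
import os
import tempfile


def lookup(path, key):
    """Linear scan of the file; return the value for `key` or None."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                k, sep, v = line.rstrip("\n").partition(" ")
                if sep and k == key:
                    return v
    except FileNotFoundError:
        pass
    return None


def update(path, key, value):
    """Write a new copy of the file with `key` set, then move it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    written = False
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            try:
                with open(path, "r", encoding="utf-8") as old:
                    for line in old:
                        if not line.endswith("\n"):
                            line += "\n"
                        k, sep, _ = line.rstrip("\n").partition(" ")
                        if sep and k == key:
                            # Replace the first match, drop any duplicates.
                            if not written:
                                out.write(f"{key} {value}\n")
                                written = True
                        else:
                            out.write(line)
            except FileNotFoundError:
                pass  # No existing map; start a new one.
            if not written:
                out.write(f"{key} {value}\n")
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Both operations are linear in the size of the file, which is the trade-off
being accepted: no index to maintain and a small memory footprint, in
exchange for a full read on every lookup and a full rewrite on every update.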

>I've never actually seen this problem happen whenever I've fooled around
>with OE so I've never looked at the code to figure out what to do.

I get the impression it's a specific OE usage pattern that triggers
it. I've had it described to me as "send a mail, click the <send/check>
button", which sounds common enough to me. 

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/




More information about the Info-cyrus mailing list