Seen databases

Bron Gondwana brong at fastmail.fm
Wed Apr 28 08:19:25 EDT 2010


At the moment Cyrus appears to support 3 seen backends:

* seen_local:
    stores all the seen data for all users in a file in
    the spool directory.  Legacy.

* seen_db:
    as far as I can see, everyone uses this.  It's the only
    one that replication's SETSEEN_ALL command works with
    for sure.

* seen_bigdb:
    a single database for ALL users' seen data.

Now - I'm in two minds.  I've already made one HUGE change
to how seen is handled: for user.* mailboxes, the owner's
seen state is now a system_flag in the index record, and
the owner's recentuid is in the index header.  This catches
99% of cases and reduces IO, since compulsory CONDSTORE
means we're always updating the record for seen changes
anyway.
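
To make the owner/non-owner split concrete, here is a minimal
sketch.  All names in it (sketch_index_record, SKETCH_FLAG_SEEN,
sketch_is_seen, the lookup callback) are hypothetical and the bit
value is arbitrary; it is not the actual Cyrus index layout, just
an illustration of "owner reads the flag, everyone else falls back
to a seen database".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit for the owner's seen state; the value is arbitrary. */
#define SKETCH_FLAG_SEEN (1u << 4)

/* Hypothetical, heavily trimmed stand-in for an index record. */
struct sketch_index_record {
    uint32_t uid;
    uint32_t system_flags;
};

/* Hypothetical fallback: ask a per-user seen database about one UID.
 * Returns non-zero if the message is seen. */
typedef int sketch_seen_lookup(uint32_t uid, void *rock);

/* Owner: read the flag straight from the index record (no extra file).
 * Anyone else: fall back to the seen database lookup, if one is given. */
int sketch_is_seen(const struct sketch_index_record *rec,
                   int user_is_owner,
                   sketch_seen_lookup *lookup, void *rock)
{
    if (user_is_owner)
        return (rec->system_flags & SKETCH_FLAG_SEEN) != 0;
    return lookup ? (lookup(rec->uid, rock) != 0) : 0;
}
```

In this sketch the common case (the owner reading their own mailbox)
never touches a seen file at all, which is where the IO saving would
come from.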

So - in most cases there will be no $user.seen file any
more.  I'm wondering if there is actually any benefit in
supporting three different operating modes for seen, or
if we should standardise on one. The choices are either
seen_db (advantage - less can go corrupt if anything
goes wrong) or seen_bigdb (advantage - only one file,
reduces the "stat" call and inode caching cost).

For that matter - if we standardised all $user.sub files
into a subscription.db, we'd cut yet another bunch of
tiny files.  I'll probably leave that one alone for now,
since otherwise these changes will get totally out of
hand...

Speaking of which, I'm probably due to write another
update on how my future branch work is going!

Anyway - the reason I'm writing this is: I can see
that I'm going to need to provide a "seen_user_foreach"
API which calls a function with each seen record name
for a given user... and I'm wondering if I should write
3 implementations (one per backend) or just not bother
and standardise on one.
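
For discussion, a rough sketch of what the shape of that API might
look like.  Only the name seen_user_foreach comes from the text
above; the callback type, the backend struct and the toy in-memory
backend are all assumptions made up for illustration, not existing
Cyrus code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: called once per seen record name.
 * Returning non-zero stops the iteration early. */
typedef int seen_foreach_cb(const char *name, void *rock);

/* Hypothetical backend interface: each backend would supply its own
 * way of walking one user's seen records. */
struct seen_backend {
    const char *name;
    int (*user_foreach)(const char *user, seen_foreach_cb *cb, void *rock);
};

/* Toy in-memory data standing in for a real backend. */
struct toy_record { const char *user; const char *name; };

static const struct toy_record toy_records[] = {
    { "alice", "uniqueid-1" },
    { "bob",   "uniqueid-2" },
    { "alice", "uniqueid-3" },
};

/* Walk the toy table, calling cb for every record owned by `user`. */
static int toy_user_foreach(const char *user, seen_foreach_cb *cb, void *rock)
{
    size_t i;
    for (i = 0; i < sizeof(toy_records) / sizeof(toy_records[0]); i++) {
        int r;
        if (strcmp(toy_records[i].user, user) != 0) continue;
        r = cb(toy_records[i].name, rock);
        if (r) return r;   /* callback asked us to stop */
    }
    return 0;
}

const struct seen_backend toy_backend = { "toy", toy_user_foreach };

/* Generic entry point: dispatches to whichever backend is in use. */
int seen_user_foreach(const struct seen_backend *be, const char *user,
                      seen_foreach_cb *cb, void *rock)
{
    return be->user_foreach(user, cb, rock);
}

/* Example callback: counts records via an int passed as the rock. */
int count_cb(const char *name, void *rock)
{
    (void)name;
    (*(int *)rock)++;
    return 0;
}
```

With three backends there would be three user_foreach
implementations behind that struct; standardising on one backend
collapses it to a single function.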

Bron.

