Seen databases
Rob Mueller
robm at fastmail.fm
Wed Apr 28 18:44:54 EDT 2010
> So - in most cases there will be no $user.seen file any
> more. I'm wondering if there is actually any benefit in
> supporting three different operating modes for seen, or
> if we should standardise on one. The choices are either
> seen_db (advantage - less can go corrupt if anything
> goes wrong) or seen_bigdb (advantage - only one file,
> reduces the "stat" call and inode caching cost)
I think in this case, reducing options is a good idea. seen_local is legacy
anyway, and why have two other options when, as far as I can tell, everyone
only ever uses one of them?
So I'd say go with one option; it means there are fewer variables and things
to debug.
Whether to go with seen_db or seen_bigdb is trickier. seen_db is what almost
everyone uses now, but seen_bigdb seems almost sane, since in most cases the
user's own seen state will be in the cyrus.index.
There's one issue with seen_bigdb though: you really would have to use a
real DB (e.g. bdb or skiplist), not the text file db.
The other issue I can see is that the seen db is indexed by folder uniqueid.
How "unique" are folder IDs? They're generated in a pretty ad hoc fashion, and
it's always scared me that it might be too easy to generate clashes
(especially when restoring from backups), which would be especially bad for a
seen_bigdb.
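
To make the clash worry concrete, here's a rough toy sketch in Python. It is
not Cyrus code: the dict just stands in for a seen database keyed by folder
uniqueid, and the function names and the id value are made up for
illustration. It only shows that if two folders end up with the same id, they
silently share one seen record.

```python
# Toy model only: a dict stands in for a seen database keyed by folder
# uniqueid. All names and values here are hypothetical, not from Cyrus.

def mark_seen(seen_db, uniqueid, uids):
    """Store the set of seen message UIDs for the folder with this uniqueid."""
    seen_db[uniqueid] = set(uids)

def get_seen(seen_db, uniqueid):
    """Return the seen UIDs recorded for this uniqueid (empty set if none)."""
    return seen_db.get(uniqueid, set())

seen_db = {}
inbox_id = "4a1f2b3c"      # made-up uniqueid of an existing folder
restored_id = "4a1f2b3c"   # a restored folder that happens to get the same id

mark_seen(seen_db, inbox_id, [1, 2, 3])
mark_seen(seen_db, restored_id, [10])   # overwrites the first folder's record

clobbered = get_seen(seen_db, inbox_id)  # no longer the original [1, 2, 3]
```

The more folders share a single database, the more pairs of ids there are
that could collide like this.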
Rob
More information about the Cyrus-devel mailing list