Cyrus database and file usage data

Thomas Jarosch thomas.jarosch at intra2net.com
Mon Jun 13 04:10:57 EDT 2016


On Monday, 13. June 2016 00:05:13 Bron Gondwana via Cyrus-devel wrote:
> If twoskip is too slow (possible), then I've been quite interested in
> looking at rocksdb (http://rocksdb.org/) as an embedded engine that has
> really good performance, prefix scanning, and a good community around it.
>  It's also quite compatible with object storage because all but the
> level0 "hot" databases are read-only, so you can store them as objects
> once and then not need to scan them again.

rocksdb sounds like an interesting piece of technology.
Secondary indexes look a bit tricky though.

Would the new cyrus databases benefit from secondary indexes at some level?

Good read:
https://voltdb.com/blog/foundationdbs-lesson-fast-key-value-store-not-enough

and:
https://eng.uber.com/schemaless-part-one/
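
To make the "tricky" part concrete, here is a minimal sketch of a secondary index layered on a plain ordered key-value store. Everything in it is hypothetical: `OrderedKV` is a toy in-memory stand-in (not the rocksdb or twoskip API), and the `msg/` and `idx/sender/` key layouts are invented for illustration. The point is that every primary write has to be paired with index writes, and stale index entries have to be removed by hand.

```python
import bisect


class OrderedKV:
    """Toy in-memory stand-in for an ordered key-value store with prefix scans."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def get(self, key):
        return self._data.get(key)

    def scan_prefix(self, prefix):
        # Seek to the first key >= prefix, then walk while the prefix matches.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._data[self._keys[i]]
            i += 1


def put_message(db, uid, sender, subject):
    """Write the primary record and keep the by-sender index in step."""
    primary_key = f"msg/{uid:010d}"
    old = db.get(primary_key)
    if old is not None:
        # Drop the stale index entry before writing the new one.
        old_sender = old.split("\0", 1)[0]
        db.delete(f"idx/sender/{old_sender}/{uid:010d}")
    db.put(primary_key, f"{sender}\0{subject}")
    db.put(f"idx/sender/{sender}/{uid:010d}", "")


def uids_by_sender(db, sender):
    """Answer a secondary lookup with a single prefix scan over index keys."""
    prefix = f"idx/sender/{sender}/"
    return [int(key.rsplit("/", 1)[1]) for key, _ in db.scan_prefix(prefix)]
```

In a real store the primary write and the index writes would have to go into one atomic batch or transaction; otherwise a crash between them leaves the index out of step with the data, which is a large part of why secondary indexes are awkward on a bare key-value engine.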

> An alternative there is multi-level databases in the same way we have the
> search tiers - with offline repack and substituting a new database with
> identical contents (minus dead records) atomically in the way that we do
> it with search.  This eliminates the stop-the-world repacks that
> occasionally hit us with both cyrus.index/cyrus.cache and all the
> twoskip/skiplist databases, because repack can be done in the background
> to new read-only files, with all writes happening to a small level0
> database.

I absolutely hate it when a routine git command suddenly takes ten minutes 
to complete because it's doing a large repack on the server side.

-> offline repack is a great idea
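
A minimal sketch of the tiering idea quoted above, under loud assumptions: the class name `TieredStore`, the JSON file format, and the in-memory `level0` dict are all invented for illustration and have nothing to do with the actual Cyrus on-disk formats. It only shows the two mechanics that matter: reads consult the small writable tier first, and repack writes a complete new read-only file and swaps it in with an atomic rename.

```python
import json
import os
import tempfile

_TOMBSTONE = object()  # marks a key deleted in level0 but still present in the base


class TieredStore:
    def __init__(self, base_path):
        self.base_path = base_path
        self.level0 = {}            # small "hot" tier that takes all writes
        self.base = self._load()    # read-only tier, replaced only by repack

    def _load(self):
        if not os.path.exists(self.base_path):
            return {}
        with open(self.base_path) as f:
            return json.load(f)

    def put(self, key, value):
        self.level0[key] = value

    def delete(self, key):
        self.level0[key] = _TOMBSTONE

    def get(self, key):
        # The newest tier wins; fall through to the read-only base otherwise.
        if key in self.level0:
            value = self.level0[key]
            return None if value is _TOMBSTONE else value
        return self.base.get(key)

    def repack(self):
        # Merge both tiers and drop dead records.
        merged = dict(self.base)
        merged.update(self.level0)
        live = {k: v for k, v in merged.items() if v is not _TOMBSTONE}

        # Write the replacement next to the old file so the rename stays
        # on one filesystem, and make it durable before switching over.
        directory = os.path.dirname(os.path.abspath(self.base_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(live, f)
            f.flush()
            os.fsync(f.fileno())

        # Atomic substitution: readers see either the old or the new file.
        os.replace(tmp_path, self.base_path)
        self.base = live
        self.level0 = {}
```

This sketch repacks synchronously. A true background repack would also have to deal with writes that land in level0 while the new file is being built, for example by freezing the current level0 and opening a fresh one before the merge starts.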

> It is a massive change to the on-disk data formats!  We'd be left with
> basically:
> 
> * key value stores
> * cache format (multiple fixed-length binary items per file with file
>   number + offset addressing)
> * rfc822 messages (either stick with one-file-per-message or do some
>   MIX style multiple-per-file - this can be independent)

Another thing to consider: data corruption. How will the new
code / databases deal with different levels of data corruption?

One of the complaints that I read about cyrus vs. dovecot is that dovecot
heals itself when it discovers a corrupted mailbox, while cyrus-imapd
requires an admin to run reconstruct & friends. I still need to run
reconstruct every six months or so after a power outage has corrupted a
db file.
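
One simple form of self-healing, sketched here under stated assumptions: an append-only file of records, each carrying its length and a CRC32, that is verified on open and truncated at the first record that fails the check. The record layout and the function names `append_record` and `open_and_heal` are hypothetical; this is not how Dovecot or Cyrus actually store data, only an illustration of detecting a torn write after a power outage without admin intervention.

```python
import os
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def append_record(path, payload):
    """Append one checksummed record and force it to disk."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())


def open_and_heal(path):
    """Return (records, healed); truncate the file at the first bad record."""
    with open(path, "rb") as f:
        data = f.read()

    records = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) != length or zlib.crc32(payload) != crc:
            break  # short or damaged record: stop trusting the file here
        records.append(payload)
        pos = start + length

    healed = pos != len(data)
    if healed:
        # Cut off the damaged tail so later appends start from a clean state.
        with open(path, "r+b") as f:
            f.truncate(pos)
    return records, healed
```

Truncating only repairs damage at the tail. Corruption in the middle of the file would discard every later record too, so a fuller design would need a way to resynchronise or to rebuild from the message files, which is roughly the job reconstruct does today.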


Maybe some ideas can also be found here:

http://wiki.dovecot.org/Design/Indexes
http://wiki2.dovecot.org/Design/Indexes/TransactionLog
http://wiki.dovecot.org/IndexFiles
http://wiki2.dovecot.org/MailboxFormat/dbox

Personally I've never seen the idea of a "lockless integer" before,
but it looks like a neat trick.
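
For readers who have not met the trick: as the Dovecot design notes describe it, one bit of every byte is reserved as a "this byte has been written" flag, so a reader that sees a half-written value can tell it is not valid yet without taking a lock. The sketch below is an illustration only. It assumes a 4-byte field, the high bit of each byte as the flag, and big-endian order of the 7-bit groups; the function names are hypothetical and are not Dovecot's.

```python
def encode_lockless(value):
    """Pack a 28-bit integer into 4 bytes, 7 payload bits per byte.

    The high bit of every byte is set to mark that byte as written.
    """
    if not 0 <= value < 1 << 28:
        raise ValueError("value must fit in 28 bits")
    return bytes(0x80 | ((value >> shift) & 0x7F) for shift in (21, 14, 7, 0))


def decode_lockless(raw):
    """Return the stored integer, or None if the field is not fully written."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    if any(byte & 0x80 == 0 for byte in raw):
        return None  # at least one byte is still unset: treat as "no value"
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value
```

The scheme only works for fields that go from "unset" to a value exactly once; changing a value that is already set could expose a mix of old and new bytes that all carry the flag. Returning `None` for an unset field is a choice made in this sketch.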

Cheers,
Thomas


