cyrus and 500k to >1M mails per inbox
brong at fastmail.fm
Thu Feb 11 15:49:09 EST 2010
On Fri, Feb 12, 2010 at 07:37:20AM +1100, Rob Mueller wrote:
> >>> It's quite likely that these mailboxes will grow to 500000 or even
> >>> more than 1M mails per mailbox.
> >>> Does anybody have experience with such big mailboxes?
> >>Is the I/O cost of adding a message O(n), n being the number of msgs
> >>already in the mailbox, or is it independent thereof?
> >Stacks of stuff is O(n), sorry. It's not really an ideal way to
> >do things.
> >You'd be better off archiving per month for each mailbox and keeping the
> >message counts to a reasonable number.
> Well it's not quite as bad as that.
> If the mailbox is mostly just appended to, then it's not a problem.
> The append code just appends to cyrus.index and cyrus.cache I
> believe (Bron?), it doesn't have to read the cyrus.* files each
It also has to mmap the cyrus.index and cyrus.cache files, which
can be an issue on 32-bit architectures due to process size limitations.
It's amazing how quickly you can process a million records on a modern
machine as well!
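
To make the mmap point concrete, here is a rough sketch in Python. It is not Cyrus code: the 16-byte record layout, the function names and the 3 GiB address-space figure are all assumptions picked for illustration, and the real cyrus.index record is a different size. It writes fixed-size records to a file, maps the whole file and walks every record, so you can see that the mapping grows linearly with the message count and time the scan on your own hardware.

```python
import mmap
import struct

# Hypothetical 16-byte record: uid, message size, flags.
# This is NOT the real cyrus.index layout, just a stand-in.
RECORD = struct.Struct(">IIQ")

# Assumed usable address space for a 32-bit process (varies by OS).
ASSUMED_32BIT_LIMIT = 3 * 2**30


def write_records(path, count):
    """Write `count` fixed-size records, with uids 1..count."""
    with open(path, "wb") as f:
        for uid in range(1, count + 1):
            f.write(RECORD.pack(uid, 1024, 0))


def scan_records(path):
    """mmap the whole file and read every record once.

    Returns (record_count, sum_of_message_sizes, mapped_bytes).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            mapped = len(m)
            count = 0
            total = 0
            for offset in range(0, mapped, RECORD.size):
                _uid, size, _flags = RECORD.unpack_from(m, offset)
                count += 1
                total += size
            return count, total, mapped


def projected_map_bytes(n_messages, record_size=RECORD.size):
    """Bytes of address space needed to map n fixed-size records."""
    return n_messages * record_size


def fits_in_32bit(n_bytes, limit=ASSUMED_32BIT_LIMIT):
    """True if a mapping of n_bytes fits under the assumed limit."""
    return n_bytes < limit
```

Calling write_records() with a million records and wrapping scan_records() in time.perf_counter() gives a feel for the scan cost. The index is only part of the picture: cyrus.cache is mapped too and is normally the larger file, so the address-space pressure on 32-bit comes from both mappings added together.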
> We have actually seen a mailbox with > 1M messages at FastMail, and
> I remember that I could view it ok. However, I'll add a few caveats
> around that.
> 1. Our web interface has been highly optimised over time to make
> sure it deals ok with large UID lists. I'm not sure how many email
> clients will handle viewing an IMAP folder with 1M messages
> 2. You need a good filesystem as well, because there will be 1M
> files in a directory. We use reiserfs3, and again, it has no problem
> with that many files in a directory; mileage with other filesystems
> might vary.
Yep - and as I said, if you ever remove old stuff you'll hit pretty
high expunge costs (I don't know how filesystem unlink cost scales
with the size of the directory... probably depends on the
filesystem a fair bit too!)
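
If you want to find out how unlink cost scales on a particular filesystem, something like the following Python sketch would give a first number. It is only an assumption about how one might measure it, not anything Cyrus ships: time_unlinks() is a made-up helper, the files are empty, and the "<uid>." names simply imitate how message files are named in a mailbox directory.

```python
import os
import tempfile
import time


def time_unlinks(n_files, parent=None):
    """Create n_files empty files in a fresh directory, then time
    how long it takes to unlink them all.

    `parent` selects the filesystem under test (default: system temp).
    Returns (seconds_per_unlink, files_left_in_directory).
    """
    with tempfile.TemporaryDirectory(dir=parent) as d:
        # Name files "<uid>." to imitate a mailbox directory.
        paths = [os.path.join(d, "%d." % uid)
                 for uid in range(1, n_files + 1)]
        for p in paths:
            open(p, "wb").close()

        start = time.perf_counter()
        for p in paths:
            os.unlink(p)
        elapsed = time.perf_counter() - start

        remaining = len(os.listdir(d))
    return elapsed / n_files, remaining


if __name__ == "__main__":
    # Compare per-file cost as the directory gets bigger.
    for n in (1000, 10000, 100000):
        per_file, _ = time_unlinks(n)
        print("%7d files: %.1f us per unlink" % (n, per_file * 1e6))
```

Treat the numbers as a rough guide only. Freshly created empty files sitting in the page cache are cheaper to remove than old, scattered message files on a busy spool, so a real expunge will probably look worse than this.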