performance on large inboxes

Phil Pennock info-cyrus-spodhuis at spodhuis.org
Wed Nov 8 12:09:24 EST 2006


On 2006-11-08 at 17:02 +0100, Marten Lehmann wrote:
> from time to time we have users with a very large inbox, which means it 
> contains 20.000 messages or even more. My quite general question is: 
> What is cyrus doing once a user logs in through imap or pop3? It seems, 
> that it is parsing the directory, which takes very long. But what does 
> it have the indices for?
> Of course I know that cyrus stores flags and headers in it, but why does 
> cyrus parse the directory if all parts that are fetched through the 
> session are the Date, To, From and Subject-header? Shouldn't that be in 
> the index so cyrus doesn't have to touch the directory except the 
> cyrus-files?

What is fetched depends upon the client software and what it asks for.

Some software asks for rather more than Cyrus caches by default, so you
need to modify the server to cache more.  Some software, e.g. MS
Outlook, just asks for _all_ headers.  For those you have a problem,
but I believe there are patches around to cache all headers.

As long as the requested headers are those which Cyrus caches, Cyrus
won't need to scan the files.
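
To make that rule concrete, here is a minimal sketch in C of the
decision: a request can be served from the cache only if every header
it names is in the cached list.  The names (cached_headers, is_cached,
can_answer_from_cache) and the list contents are made up for
illustration and are not taken from the Cyrus source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical list of cached header names, for illustration only. */
static const char *const cached_headers[] = {
    "content-type", "message-id", "references", "x-label"
};

/* Header names are case-insensitive, so compare them that way. */
static int header_name_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Returns 1 if the named header is in the cached list, else 0. */
static int is_cached(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(cached_headers) / sizeof(cached_headers[0]); i++) {
        if (header_name_equal(cached_headers[i], name))
            return 1;
    }
    return 0;
}

/* Returns 1 if every requested header is cached, so the message file
 * would not have to be opened; returns 0 as soon as one is missing. */
static int can_answer_from_cache(const char *const *requested, size_t n)
{
    size_t i;
    assert(requested != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!is_cached(requested[i]))
            return 0;
    }
    return 1;
}
```

With this list, a client asking only for Message-ID and References
stays in the cache, while one that also asks for Received forces a read
of the message file.  A single uncached header in the request is enough
to lose the benefit, which is why clients asking for _all_ headers are
the hard case.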

For instance, a patch fragment adding to the list of cached headers:

+    /* mutt wants: */
+    { "content-description", 3 },
+    { "content-type", 3 },
+    { "lines", 3 },
+    { "list-post", 3 },
+    { "message-id", 3 },
+    { "references", 3 },
+    { "x-label", 3 },
+
+    /* gbuffy wants: */
+    { "x-face", 3 },
+    /* squirrelmail wants: */
+    { "importance", 3 },

(and yes, message-id *should* be gotten from the ENVELOPE structure, not
 by asking for the header).

To make this kind of change, look at imap/mailbox.c (for the header
list) and imap/mailbox.h (to bump a minor version number).
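
As a rough sketch of why the version bump matters: I am assuming here
that the number next to each header in the fragment above is the
minimum cache minor version at which that header is stored, so a cache
written at an older version cannot be trusted to contain it.  The
struct, table and function names below are hypothetical, not the
identifiers used in the Cyrus source.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical table entry: header name plus the first cache minor
 * version that stores it (an assumption about what the "3" means). */
struct header_entry {
    const char *name;
    unsigned min_version;
};

static const struct header_entry header_table[] = {
    { "message-id", 3 },
    { "x-label", 3 },
    { "importance", 3 },
};

#define NOT_CACHED UINT_MAX   /* sentinel: header is never cached */

/* Case-insensitive match of two header names. */
static int names_match(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Look up the minimum cache version for a header, or NOT_CACHED. */
static unsigned min_cache_version(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof(header_table) / sizeof(header_table[0]); i++) {
        if (names_match(header_table[i].name, name))
            return header_table[i].min_version;
    }
    return NOT_CACHED;
}

/* A header is usable from a mailbox's cache only if that cache was
 * written at or after the version that introduced the header. */
static int usable_from_cache(const char *name, unsigned mailbox_cache_version)
{
    unsigned v = min_cache_version(name);
    return v != NOT_CACHED && v <= mailbox_cache_version;
}
```

Under that assumption, a mailbox whose cache is still at version 2
would fall back to reading the message files for x-label, and only
caches written at version 3 or later would answer it directly.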

-Phil

