Webmail applications that are kind to the IMAP server?
Cyrus Daboo
daboo at cyrusoft.com
Mon Sep 2 13:55:57 EDT 2002
Hi,
--On Monday, September 2, 2002 6:40 PM +0200 Sebastian Hagedorn
<Hagedorn at uni-koeln.de> wrote:
| We haven't been able to find any in our evaluation of webmail systems.
| We've chosen IMP for our webmail (<https://webmail.uni-koeln.de>) in
| spite of this issue. What bugs me is that the PHP module that's used for
| IMP seems to have a flag that basically says "use persistent
| connections". I have no idea why the developers chose not to use it ...
Because it does not work. There is a fundamental problem with using
apache+PHP to do webmail. The problem is the process model that apache has
- basically an http request can be handled by any of the pool of processes
that apache maintains. Thus it's not possible to ensure that a particular
user gets routed to a specific apache process which maintains a persistent
connection for them.
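To illustrate the process-model problem, here is a toy simulation rather than anything apache actually runs. It assumes requests are handed to workers round-robin (real apache scheduling is not that regular) and that each worker keeps its own private cache of open connections. The function name and the scheduling model are made up for the example.

```python
from itertools import cycle


def count_connections_opened(num_workers, requests):
    """Count connections opened when each worker has its own private cache.

    `requests` is a sequence of user names, one per HTTP request.
    Round-robin dispatch is an assumption made for illustration only.
    """
    worker_caches = [set() for _ in range(num_workers)]
    opened = 0
    for cache, user in zip(cycle(worker_caches), requests):
        if user not in cache:
            # This worker has never served this user, so it must
            # connect, authenticate and select from scratch.
            cache.add(user)
            opened += 1
    return opened


if __name__ == "__main__":
    one_user = ["alice"] * 8
    print(count_connections_opened(1, one_user))
    print(count_connections_opened(4, one_user))
```

With a single worker the one user needs one connection; with four workers the same eight requests end up opening one connection per worker, and nothing lets a worker reuse a connection held by another.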
Even if you could ensure a specific user always got back to the same apache
process, the OP_RELOGIN option in the php_imap.c code does not guarantee
any real performance benefit. Basically it assumes the ability to
'reauthenticate' to an existing IMAP connection as a different user - well
IMAP does not support that, and probably never will officially due to
security concerns. Even if IMAP did support that, you need to understand
where the load is coming from. If it turns out the authentication is
causing the load, then the OP_RELOGIN gains you nothing.
The main areas that need to be looked at for performance in this case are:
1) The initial tcp connection between the webserver and imapserver. This
can easily be optimised in a closed environment, with the two servers
either on the same or different machines.
2) The authentication process. This will likely depend on the
authentication method being used, but it really shouldn't be a major
performance issue.
3) The mailbox 'open' (SELECT/SORT/FETCH etc). This is likely to be the big
performance hit. This will be very dependent on the IMAP server. For
example, the UW server has to reparse the entire mailbox for each SELECT,
thus performance gets noticeably worse the larger the mailbox is. The CMU
server caches message data, so SELECT is nearly independent of mailbox
size. If the user has set their webmail to sort by anything other than
message number, then a SORT command will be used. Both CMU and UW will have
to do the sort each time - they don't make any attempt to cache sort order.
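A rough way to find out which of the three areas dominates on your own system is to time them separately. The following is only a sketch: the helper name and its callable arguments are invented for the example, and the commented usage assumes Python's standard imaplib with a hypothetical host, user and password.

```python
import time


def time_phases(connect, login, open_mailbox, clock=time.perf_counter):
    """Run the three phases in order and return the seconds spent in each.

    connect()          -> returns a connection object   (area 1)
    login(conn)        -> authenticates                 (area 2)
    open_mailbox(conn) -> SELECT / SORT / FETCH etc.    (area 3)
    """
    t0 = clock()
    conn = connect()
    t1 = clock()
    login(conn)
    t2 = clock()
    open_mailbox(conn)
    t3 = clock()
    return {"connect": t1 - t0, "auth": t2 - t1, "open": t3 - t2}


# Possible usage against a real server (host and credentials are placeholders):
#
#   import imaplib
#
#   def open_mailbox(conn):
#       conn.select("INBOX")
#       conn.sort("DATE", "UTF-8", "ALL")   # only if the server supports SORT
#
#   timings = time_phases(
#       lambda: imaplib.IMAP4("imap.example.org"),
#       lambda conn: conn.login("user", "password"),
#       open_mailbox,
#   )
#   print(timings)
```

Running this against mailboxes of different sizes should show whether the 'open' phase grows with mailbox size on your server.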
What is really required for a persistent connection webmail system using
apache is in effect a full imap client on the webserver for each user. That
client would then maintain the persistent connection keeping the mailbox in
the 'open' state, eliminating the need to re-authenticate, select, sort
etc. Since that does not fit the apache process model you would likely have
to have these clients run as separate processes and then have apache
processes communicate with those to generate the html sent to the user.
This is the 'imap gateway' type approach described by the Squirrel Mail
people. Clearly it is a major engineering task to do and will increase the
load of the webserver way beyond what it is now.
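As a very reduced sketch of the core of such a gateway (not SquirrelMail's design or anyone's actual implementation): a single long-lived process keeps one open session per user and hands it back on every request. The class name and the open_session callable are hypothetical; a real gateway would also need a protocol for the apache processes to talk to it, idle timeouts, and handling of dropped connections.

```python
import threading


class SessionGateway:
    """Keep one already-authenticated, already-selected session per user."""

    def __init__(self, open_session):
        # open_session(user) is a hypothetical callable that connects,
        # authenticates and selects the mailbox, returning a session object.
        self._open_session = open_session
        self._sessions = {}
        self._lock = threading.Lock()

    def get(self, user):
        """Return the user's existing session, opening one only if needed."""
        with self._lock:
            session = self._sessions.get(user)
            if session is None:
                session = self._open_session(user)
                self._sessions[user] = session
            return session

    def close(self, user):
        """Drop the user's session, e.g. on webmail logout."""
        with self._lock:
            session = self._sessions.pop(user, None)
        if session is not None:
            session.logout()
```

Because the gateway lives outside the apache process pool, it does not matter which apache process handles a given request; the cost is one extra resident client per logged-in user.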
There are some webmail solutions that do maintain persistent connections -
those are based on other http servers that have a different process model
that allows that.
PHP does use persistent connections in other areas to great benefit - e.g.
database connections - but those don't have the same requirements for
authentication etc that IMAP does so it doesn't matter which apache process
handles a request.
In any event, I think you can do reasonably well with IMP provided you take
time to tweak your system for performance - looking at the three areas I
described above is a good start.
--
Cyrus Daboo