IMAP/LMTP delays (was Re: IMAPD Mail Delays)

Michael D. Sofka sofkam at rpi.edu
Wed Sep 5 15:16:45 EDT 2007


On Tuesday 04 September 2007 03:42:05 pm Corey Bobb wrote:
> The current problem I am having is that I am receiving mail from my outside
> email relay (postfix), which from what I gather forwards it to the cyrus
> mail server.  I can see the postfix server getting the mail, but
> it will take another 30 minutes before the message can actually be viewed
> in a mail client.

I've been seeing this as well.  In our case, it's cyrus 2.2.12 with
sendmail 8.13.1 using MAILER(cyrus2).  Cyrus runs in a murder cluster with a
master, two front-end and three back-end servers.  (One back end is not in
use: it is old and being retired, and runs account creation scripts which
still need to be moved.)

Sendmail reports: "Deferred: Connection timed out with mail.rpi.edu."
Occasionally it also reports: "Deferred: Broken pipe."  (mail.rpi.edu is a
VIP.  Port 25 connections are sent to one of four smtp machines via a Cisco
CSS director.)

The 60-minute delay (in our case) was due to the MinQueueAge parameter in
sendmail.  Once a delivery was deferred on one of the smtp servers, that
server kept queueing email for 15 minutes (until the .hoststat entry is
rechecked).

The queue runners would then skip the queued email for 1 hour.  To work
around this, I put the local email in a separate queue group, and run
sendmail out of cron with a MinQueueAge value of 10 minutes.  (Putting a
runner on the queue group didn't work, since MinQueueAge takes precedence.)
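
To make the arithmetic concrete, here is a rough Python sketch of why a more
frequent runner alone does not help.  It assumes a deferred message is only
retried by the first queue run that happens at least MinQueueAge after the
last attempt, and that runs happen at a fixed interval.  The function name
and the 15- and 5-minute intervals are mine, for illustration only.

```python
def first_retry_delay(min_queue_age, run_interval):
    """Minutes until a deferred message is first retried.

    Simplified model (an assumption, not sendmail's actual code): queue
    runs happen every run_interval minutes after the deferral, and a run
    skips the message until min_queue_age minutes have passed.
    """
    # Ceiling division: number of runs before the message is old enough.
    runs_needed = max(1, -(-min_queue_age // run_interval))
    return runs_needed * run_interval


# MinQueueAge of 60 minutes: running the queue more often changes nothing.
print(first_retry_delay(60, 15))
print(first_retry_delay(60, 5))

# MinQueueAge of 10 minutes, with a hypothetical 5-minute cron run.
print(first_retry_delay(10, 5))
```

Under this model the retry delay is bounded below by MinQueueAge no matter how
often the queue is run, which matches what I saw.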

But the underlying problem remains: for some reason LMTP is returning a
400-level (temporary failure) DSN.

Perhaps related, or not, I have also noticed occasional delays connecting to 
IMAP.  They tend to cluster around 15 minutes before the hour.  It got 
particularly bad when students returned last week.

Google turned up a mention of a bug in mupdate which caused it to double
free threads when the maximum number of connections is reached.  Indeed,
there were numerous "could not start a new worker thread" errors in the
cyrus.log file on the murder master.  I doubled mupdate_connections_max to
2048 and the problem is reduced.  The delays are still there, but they do not
last as long.  There are still a few 'could not start worker thread' errors.
The change required restarting services on the master, which killed about 800
mupdate connections.  Currently there are about 26 mupdate connections.
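
One way to check whether the worker thread errors line up with the IMAP delays
around 15 minutes before the hour is to count them by minute of the hour.
Below is a rough Python sketch.  It assumes cyrus.log lines start with a
syslog-style timestamp ("Sep  5 14:46:02 ..."); the sample lines are made up
for illustration (host name, pid and times are not from our logs), and only
the error text is the real message.

```python
import re
from collections import Counter

# Assumed syslog-style prefix: "Mon DD HH:MM:SS "; captures the minute.
TIMESTAMP = re.compile(r"^\w{3}\s+\d{1,2}\s+\d{2}:(\d{2}):\d{2}\s")
NEEDLE = "could not start a new worker thread"


def errors_by_minute(lines):
    """Count matching error lines per minute of the hour (0-59)."""
    counts = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample = [
    "Sep  5 14:46:02 master mupdate[2101]: could not start a new worker thread",
    "Sep  5 14:46:40 master mupdate[2101]: could not start a new worker thread",
    "Sep  5 15:03:11 master mupdate[2101]: accepted connection",
    "Sep  5 15:45:09 master mupdate[2101]: could not start a new worker thread",
]

for minute, count in sorted(errors_by_minute(sample).items()):
    print(f"minute {minute:02d}: {count}")
```

On a real log, pass the open file object instead of the sample list.  A spike
near minute 45 would suggest the two problems are related.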

Any help with this (these?) problem(s) would be appreciated.

Mike

-- 
Michael D. Sofka               sofkam at rpi.edu
C&MT Sr. Systems Programmer,   Email, TeX, Epistemology
Rensselaer Polytechnic Institute, Troy, NY.  http://www.rpi.edu/~sofkam/

