Subject: Bounced Mail Due to Postfix/Cyrus conflict

Jon.Rifkin at UConn.EDU Jon.Rifkin at UConn.EDU
Thu Feb 15 21:00:25 EST 2007


Subject: Bounced Mail Due to Postfix/Cyrus conflict


INTRO

This message describes a problem between Postfix and Cyrus, and describes
a workaround which worked for us.  The problem is caused by a
mis-communication between Postfix's lmtp client and Cyrus's lmtpd
daemon, and is triggered when Cyrus's lmtpd daemon has too many
'lockers' locking its database.  The result is that Postfix fails
to deliver valid email, instead returning it to the sender.

I'm posting this to (1) help users who are experiencing similar
problems, and (2) alert Postfix and Cyrus developers to this problem.


FIRST SYMPTOM

For each returned email, our log showed lines like this,

  Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>,
orig_to=<...>,
   relay=public/lmtp[public/lmtp], delay=1, status=bounced
   (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA
command))

The mystery was, why is postfix/lmtp bouncing mail?  Didn't it receive
an OK message ("250 2.1.5 ok")?


THE ACTUAL PROBLEM

Following the on-line advice, we turned on verbose messages for
lmtp, replacing "lmtp" with "lmtp -v"  in the appropriate column in
/etc/postfix/master.cf, and then re-examined our logs.

The cause of our failed emails was then narrowed to the conversion
between Postfix's lmtp client, and Cyrus's lmtpd daemon.  The problem
happened in two stages.

In the first stage, Cyrus's lmtpd would report a problem with too many
'lockers',

 Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.0.0 ok
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]:
MAIL FROM:<...> SIZE=26455
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]:
RCPT TO:<...>
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.1.0 ok
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.1.5 ok
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354
go ahead
 --C1>>
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: .
 Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350
lockers

As a result, Postfix defers the message for later delivery (note that
below the status says 'bounced', but later log entries (not shown) show
that this messages was retried and eventually delivered).  Here's the
bounce message give in this stage,

 Feb 13 13:05:51 HOST postfix/lmtp[11340]: 45ACD380F0: to=<...>,
   relay=public/lmtp[public/lmtp], delay=5, status=bounced
   (host public/lmtp[public/lmtp] said: 350 lockers (in reply to end of
DATA command))


In the next stage, the conversation continues, and an attempt is made to
deliver a second email.  Below is the rest of the conversation between lmtp
and lmtpd.

 --C2>>
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RSET
 <<R1--
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.1.5 Ok
 --C3>>
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]:
MAIL FROM:<...>
 --C4>>
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]:
RCPT TO:<...>
 --C5>>
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA
 <<R2--
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.0.0 ok
 <<R3--
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.1.0 ok
 <<R4--
 Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250
2.1.5 ok
 Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>,
orig_to=<...>,
   relay=public/lmtp[public/lmtp], delay=1, status=bounced
   (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA
command))

The lines following the marks like --C1>> and <<R1-- are the commands
given by lmtp, and the responses returned by lmtpd.

Here, Postfix/lmtp bounces the email because in response to its DATA
command (marked --C5>> above), it receives the reply "250 2.1.5 ok"
(marked <<R4-- above), but it rightfully expects instead "354 go ahead".

Postfix's lmtp gets the wrong response because the responses are out of
sync.  You can see this by following the sequence of replies.  The first
command from lmtp is the '.' command (--C1>> above) signaling the end of
input to the DATA command.  The response to this <<R1--, although it
*appears* to be the repines to the RSET command.  By following the
sequence you can see that lmtp thinks that "250 2.1.5 ok" is an invalid
response to the "DATA" command, but it's really the valid response to
the "RCPT TO:" command.

Note that last line above.  It shows lmtp bouncing the email.  While
this bounce looks similar to the first bounce, these types of bounces
were always returned to sender.


OUR WORKAROUND

Our workaround for this problem is relatively simple.  We got Cyrus's
lmtpd client to stop issuing the 'lockers' message by turning off
'duplicatesuppression' (it is on by default), by setting

   duplicatesuppression: 0

Duplicate suppression compares every incoming
email to a list of previous emails in the duplicate database, and
rejects those that are duplicates for any particular user.  Apparently
this process is what was generating the 'lockers' messages.  Since we
have no need of this feature, turning it off is an easy workaround for
us.


WHAT IS THE RIGHT FIX?

Should Postfix's lmtp client re-queue the messages that it currently
returns?  Or, should Cyrus's lmtpd daemon do a better job with its error
messages back to Postfix's lmtp?




More information about the Info-cyrus mailing list