cyrus-imapd 2.1.15, sieve, lmtpd, and return-path header

Tue Sep 23 18:13:49 EDT 2003

--On Tuesday, September 23, 2003 13:36:18 -0700 Chris Stromsoe 
<cbs at cts.ucla.edu> wrote:

> On Tue, 23 Sep 2003, Pat Lashley wrote:
>
>> --On Tuesday, September 23, 2003 02:18:50 -0700 Chris Stromsoe
>> <cbs at cts.ucla.edu> wrote:
>>
>> > out of ~40,000 messages I've got handy, between 50 and 60 had some
>> > variant of envelope-to (either plain or with x- or old-), and 36,906
>> > had a return-path header.  Return-Path is an rfc mandated header that
>> > must be handled.
>>
>> None of the messages should have had either of those headers when you
>> received them.
>
> It's very unlikely that any of the messages did have a Return-Path when
> they were handed to me.  I wasn't saying that they did.

It wasn't clear whether you were testing incoming messages or ones
already in mailboxes.  And your statement above certainly implies
that a bit less than 10% of the messages do NOT have a Return-Path
header.  Is that in fact the case; or was that 40,000 an overestimate?

>                                                           All of the
> messages containing envelope-to variants came that way.  (And, in fact,
> there is no reason why they shouldn't -- as you said yourself, envelope-to
> is not a standard header; it could be inserted at either end.)

Systems that do generate Envelope-To headers generally only do so when
they believe that the message is about to achieve final delivery; and
generally remove Envelope-To headers that already existed on incoming
messages, whether or not they will be passed on or delivered locally.
(On the theory that a pre-existing header may not be trustworthy.)

>> Yes, the RFC mandates a Return-Path header; but it specifically says
>> that it is to be generated only at the point of final delivery into
>> a file.
>
> Keep reading the RFC.  It also says that the message is considered
> delivered when it leaves the SMTP environment.  Ie, when lmtpd takes it,
> lmtpd should add the Return-Path header.

LMTP is a dialect of SMTP; so it hasn't left the SMTP environment
until sieve is through with it and lmtpd puts it into a cyrus mailbox.
(If, in fact, that is what sieve says is to happen to the message.)

> This seems very much like you're sniping at me.  I almost expected the
> sentence to end with ", junior" or something similar.  It's nice that you
> have a lot of experience doing software development and that you bring
> your experience to play.  However, I don't believe that your experience
> has anything to do with this issue.

I'm sorry that you feel that way.  I was merely trying to point out
that you are focusing on one half of the problem; and that your proposed
'solution' fails to address the other half.  Or even admit to its existence.

As to the applicability of my experience; that is what helps me to step
back and look at the big picture to identify the actual requirements
rather than the immediate request.

> The RFC _requires_ Return-Path.  It is standard.  Envelope-To is something
> that somebody made up.  I can't recall it being mentioned in any smtp rfc
> that I've read recently.  I can't recall seeing it generated by any MTA.
> You keep asserting that it has something to do with this and that I should
> somehow "know" how it fits into the discussion.  I don't believe that
> Envelope-To has anything to do with discussing whether or not _lmtpd
> should make sure that its in-memory data matches what it writes to disk_.

My actual assertion is that users, given the ability to filter on
the envelope sender, will also want to filter on the envelope recipient.
If you use Return-Path as the handle with which to check the former;
you must consider Envelope-To as the appropriate handle for checking
the latter.

I also assert that it is wrong to use those headers as the mechanism
for obtaining envelope values; and that the envelope extension to
sieve provides the right mechanism.
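
A minimal sketch of that distinction, in Python rather than sieve.
The Envelope class and the envelope_is() helper are hypothetical names
used only for illustration; they are not part of cyrus or of any sieve
implementation.  The point is only that the envelope values travel
alongside the message and can be tested directly, without relying on
a Return-Path or Envelope-To header being present:

```python
from dataclasses import dataclass
from email import message_from_string


@dataclass
class Envelope:
    """Values from the (L)SMTP transaction, kept apart from the headers."""
    mail_from: str  # reverse-path given in MAIL FROM
    rcpt_to: str    # forward-path given in RCPT TO


def envelope_is(envelope: Envelope, part: str, value: str) -> bool:
    """Hypothetical stand-in for a sieve 'envelope :is' test."""
    actual = {"from": envelope.mail_from, "to": envelope.rcpt_to}[part]
    # Compare case-insensitively, as sieve's default comparator does.
    return actual.lower() == value.lower()


# Message as the filter would see it: no Return-Path or Envelope-To yet.
raw = (
    "From: List Owner <owner@example.org>\n"
    "To: undisclosed-recipients:;\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
env = Envelope(mail_from="bounces@example.org", rcpt_to="pat@example.net")

# Filter on the envelope directly instead of looking for a header.
file_into_lists = envelope_is(env, "from", "bounces@example.org")
```

The same helper covers the envelope recipient (part "to"), which is
the half that a Return-Path-only approach leaves unaddressed.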

> The x-sieve header is applied regardless of sieve script processing.  If
> you enable sieve with cyrus, it is _always_ added.

But presumably, conceptually at least, -after- sieve has had a whack
at the message.  Note that the absence of a sieve script for a given
local-part does not mean that sieve doesn't process the message; only
that it uses the default action.

>  - lmtpd has written the received header to disk.  When that message is
> forwarded or bounced the on disk file is sent.  If on-disk format is
> canonical, then the header cache needs to have the headers too.  If the
> header cache is canonical, then the on-disk representation needs to not
> have those headers until the message is dispensed with.

In the case of the Received: header, the bounce or forward happens
after lmtp has 'accepted' the message for processing; so it is correct
to include that header in the bounced or forwarded message even if
it is not correct to have it yet when sieve processes it.  (See my
previous statement that argues that the Received: header should only
be added once lmtpd indicates to the MTA that it has accepted the
message for processing.)

A similar argument applies to the X-Sieve: header - by the time the
message is bounced or forwarded, sieve is done with it; so the header
would apply.

Return-Path should be filtered out of any bounce or forward because
the message has not, in fact, achieved final delivery; and therefore
should not contain a Return-Path header.  (You can argue over whether
it shouldn't be in the file until final delivery or should be actively
filtered out for bounce/forward; but that's a minor implementation
detail.  Having it there optimizes the most common case - final delivery
to a cyrus mailbox.)
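
The two code paths being argued for can be sketched roughly as
follows.  This is an illustration in Python under stated assumptions,
not lmtpd's actual code: deliver_to_mailbox() and prepare_for_forward()
are hypothetical names, and the sketch only handles simple
single-part messages.

```python
import copy
from email import message_from_string
from email.message import Message


def deliver_to_mailbox(msg: Message, reverse_path: str) -> Message:
    """Final delivery: put a fresh Return-Path at the top of the headers."""
    final = Message()
    final["Return-Path"] = f"<{reverse_path}>"
    for name, value in msg.items():
        # Drop any pre-existing Return-Path; it cannot be trusted.
        if name.lower() != "return-path":
            final[name] = value
    final.set_payload(msg.get_payload())
    return final


def prepare_for_forward(msg: Message) -> Message:
    """Bounce/forward: the message is not finally delivered, so strip it."""
    outgoing = copy.deepcopy(msg)
    del outgoing["Return-Path"]  # no error if the header is absent
    return outgoing


raw = (
    "Return-Path: <stale@example.com>\n"
    "From: owner@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

delivered = deliver_to_mailbox(msg, "bounces@example.org")
forwarded = prepare_for_forward(msg)
```

Whether the header is left out of the file until final delivery or
stripped on the way out, as here, is the implementation detail
mentioned above; the result seen by the next hop is the same.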

>  - sendmail provides a Return-Path header to procmail, the final delivery
> agent.  sendmail could also provide one to lmtpd, but lmtpd provides its
> own.  lmtpd is the "local mail transport protocol".  Like a local mailer
> in sendmail.  It is the final delivery step.

As such, it is a specialized dialect of SMTP; and is the proper
agent for adding the Return-Path header to messages which it
actually delivers into a local mailbox.  But -NOT- to messages
which are bounced or forwarded.

>                                                Sure, other things can
> happen after that; those things have nothing to do with anything that
> happened up to that point.

Of course they do.  As much as processing in any MTA has to do with
anything that happened up to that point.

>                             When sieve forwards or rejects a message, it
> invokes sendmail as a client.  It doesn't try to relay the message through
> the next hop or anything like that.  It reinjects the message.

An implementation detail.  And one that only applies to the case where
sieve is being used as part of a nominal final delivery.  Sieve integrated
into an MTA or MUA would handle it differently.

>> You are assuming that sieve processing occurs -AFTER- final delivery.
>> The authors of lmtpd apparently consider it to be happening -BEFORE-
>> final delivery; with the sieve script itself determining whether there
>> will even BE a final delivery at that point.
>
> Or, the authors of lmtpd never considered this situation and it wasn't a
> big deal having or not having the missing headers in the cache.  I'll let
> them speak for themselves.

I wish they would.  But I strongly suspect that they had a much more
global overview of mail processing than you are espousing.

>> It makes no sense to you because you continue to believe that it is a
>> shortcoming and not a correct behaviour.  Stop trying to justify your
>> 'solution' and try to take an objective look at the problem. You will
>> find that it is solved admirably by the envelope extension.
>
> I objectively believe that lmtpd is maintaining an on-disk structure that
> does not match the structure that it is maintaining in-memory.

So what?  That's an lmtpd implementation detail that should be opaque
to you.  The on-disk version doesn't represent the message as it
currently exists to be processed by sieve; but the message as it
is expected to exist after sieve is done.

(I suspect that the file is written at that early point primarily
to have something to look at if something goes drastically wrong and
lmtpd crashes before it finishes processing; and partially as an
optimization to prevent the need to write it out multiple times if
the sieve script requests multiple actions for it.  I welcome either
confirmation or refutation from the lmtpd authors.)

> I objectively believe that this is wrong.

No, you subjectively believe that it is wrong.  All beliefs are subjective.

> I noticed this when I was looking to find out why something I wanted
> to do wasn't working.

Wasn't working -as you expected it to-.  You seem to be completely
resistant to the idea that your expectations were wrong.

> The two things are _completely separate_.

No, they aren't.  This is where that 30+ years of experience helps
again.  I've had to deal with this -many- times.  (And freely admit
that I still need to be vigilant to avoid doing it myself.)  People
tend to get focused on a particular symptom and believe that it is
the problem.  It isn't always easy to step back and take a broader
look to discover the real underlying problem.

> lmtpd is the final delivery.  Read more of 2821:
>
>    When the delivery SMTP server makes the "final delivery" of a
>    message, it inserts a return-path line at the beginning of the mail
>    data.  This use of return-path is required; mail systems MUST support
>    it.  The return-path line preserves the information in the <reverse-
>    path> from the MAIL command.  Here, final delivery means the message
>    has left the SMTP environment.  Normally, this would mean it had been
>    delivered to the destination user or an associated mail drop, but in
>    some cases it may be further processed and transmitted by another
>    mail system.
>
> Which actually implies that lmtpd is wrong to be adding the Return-Path
> and that the smtp server handing off to lmtpd should be adding it.  Which
> is beside the point.

How is that beside the point?  As I pointed out above, LMTP is a
dialect of SMTP and the message has -NOT- left the SMTP environment
until and unless lmtpd actually places it into a cyrus mailbox.

>> Wasn't the basis for your original 'problem' that naive users could not
>> understand why they couldn't filter on the Return-Path header?
>
> No.  I didn't say that anywhere.  Are you reading what I'm writing or just
> looking to argue with me?

Perhaps I misunderstood your stated reasons for wanting to provide
the ability to filter on Return-Path instead of using the envelope
extension.

> Say again?  I know the functionality of what I want.  I don't care how it
> is implemented.  There are headers.  I want to be able to filter on all of
> them.  Not a subset of them, but all of them.  I'm not overly concerned
> with how that is implemented, just that it is implemented.

Again, you are focused too tightly on a particular symptom and ignoring
the bigger 'problem'.  The functionality you want is to filter on the
envelope sender.  The implementation that you want is to be able to
use the Return-Path header.  But the envelope extension provides another,
cleaner, implementation of the actual functionality that you desire.

Obviously you do care how it is implemented; because you are steadfastly
refusing to simply use the implementation that is already in place.

You can filter on all of the headers that actually exist at that point.
The Return-Path header does -not- really exist yet.  That it appears to
exist is an implementation detail that you shouldn't be aware of.  (I
realise that this point is a bit subtle.)

> final delivery occurs when smtp passes the message to lmtpd.  What lmtpd
> does with it afterward is its own business.

That's where we disagree.  I claim that final delivery doesn't occur
until it is actually placed into a mailbox.  Apparently the authors
of the lmtpd code tended more towards my view than yours.

>> If there were a separate sieve parser, lmtpd would quite probably feed
>> it the message (without a Return-Path header) via a pipe;
>
> Look into how sendmail feeds procmail.  I use procmail to filter on
> return-path all the time.

It's been quite a while since I've used procmail; but I believe that
is a shortcoming of its implementation.  Does it have any other
mechanism to get ahold of the envelope values?

> sieve opens the on-disk file and _feeds it to sendmail_ for redirecting.
> The on-disk file that _includes the Return-Path_.

As I said before, the Return-Path should probably be filtered out; or
not added to the file until the file is placed into a mailbox.

> So, is the on-disk file just an optimization?  Or is it the "final
> delivery" and being used as such by other parts of lmtpd?

As I've said before, my position is that final delivery doesn't occur
until the message is placed into a mailbox.

> Have you looked at the code to see what it's doing?  Is there some
> technical fault with what I'm proposing?  Or are you being disagreeable
> because you don't personally like the return-path header or because I've
> somehow offended your sensibilities?

I'm sorry if you see this as being disagreeable.  I am not in the
least offended by anything you've said yet.  BUT I do believe that
your proposal focuses narrowly on a particular symptom and attempts
to fix a non-problem.  If there is a real problem/bug, it is that
the Return-Path header is allowed to remain in bounced and forwarded
messages.  If not removed entirely, it should be converted into an
X-Original-Return-Path or something.  (I don't recall off-hand whether
there's an RFC-mandated header name for that purpose.)

-Pat