need organizational hint

Michael Bacon baconm at duke.edu
Sat Apr 12 14:11:46 EDT 2003


Sorry to come in to this late -- I've been tied up all week.

First, with regard to what you're generally trying to do, I think the
general consensus around here is that you're asking for trouble.  For most
of your goals, if you try to implement them in the way that you're
considering, you're going to either spend a very long time writing a ton of
code that will bloat Cyrus beyond recognition, or you're going to end up
with your standards compliance in shreds.

I guess specifically I'm referring to the SMTP-directly-to-Cyrus bit. 
Before you go calling this a doable task, you really should go back and 
re-read RFC 2821.  You can't just include the SMTP functions you want to 
include and expect to be free of trouble; the reason MTAs are as big and 
complicated as they are is to handle all of the complexities of standards 
compliance correctly.  You're not doing yourself a favor by trying to 
replicate the functionality of one separately.

All of that said, the other reason not to go through all of this work is 
that there is *almost* enough existing, tested software to do exactly what 
you want to do, if you're willing to take the time to investigate.  I'll 
outline two slightly different ways of going about this here -- they'll 
both involve doing everything pretty much with Cyrus (of course) and 
Sendmail, because that's what we use here and that's what I know the best. 
It may be possible with one of the other MTAs as well -- I just don't know.


Strategy #1 -- This one is the first one I thought of while reading this, and
would require a modification of the lmtpd code.  The key bit here is that 
you run sendmail in interactive delivery mode (DeliveryMode=interactive in 
your sendmail.cf).  This does basically what you described in an earlier 
post -- sendmail makes the remote MTA hang on the line for the final return 
code of the DATA command while it goes ahead and makes one delivery 
attempt, which in this case is an LMTP connection to Cyrus.  If the 
delivery succeeds, the remote MTA gets a 250 code back that the message is 
accepted, and it never touches the queue.  If there's a temporary failure 
in delivery, sendmail returns a 250 to the remote MTA and queues the 
message locally for later retry.  If there's a permanent failure with 
delivery (e.g., 550 User rejects message), I *think* that gets passed back 
to the remote MTA, and the message is rejected.  I say I think because I 
don't have a good way to test this at the moment.  This keeps in line with
standards compliance, because sendmail never actually returns the 250
until it has either delivered or queued the message, so if sendmail crashes
in the middle of the transfer, the remote MTA never (better not!) dequeues
the message, and will
resend when the system recovers.  This is a bit of an oversimplification of 
the process, but hopefully you get the idea.
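
A rough Python model of the three outcomes just described may help.  It is
only a sketch of the behavior as laid out above, not sendmail's code: the
function name and the descriptive strings are invented for illustration, and
the 5xx pass-through is the untested part.

```python
from typing import Tuple


def smtp_data_reply(lmtp_code: int) -> Tuple[int, str]:
    """Model what the remote MTA sees after DATA in interactive mode.

    lmtp_code is the reply the single LMTP delivery attempt produced.
    Returns (code sent to the remote MTA, what happened locally).
    """
    if 200 <= lmtp_code < 300:
        # Delivery succeeded while the remote MTA was waiting.
        return 250, "delivered, never queued"
    if 400 <= lmtp_code < 500:
        # Temporary failure: accept responsibility and retry later.
        return 250, "queued locally for retry"
    if 500 <= lmtp_code < 600:
        # Assumption from the text above: permanent failures are passed back.
        return lmtp_code, "rejected during the SMTP session"
    raise ValueError(f"unexpected LMTP reply code: {lmtp_code}")
```

The point of the model is that the remote MTA only ever sees a 250 once the
message is either in the mailbox or safely in the local queue.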

Now, as far as the user-configurable stuff, you can theoretically do pretty 
much everything you need with SIEVE.  You earlier implied that sieve can't 
reject a message, which is only partially true.  The "reject" action in 
sieve does exactly what you would want it to on a large scale basis.  On 
the other hand, and this is where the modification would need to be done, 
lmtpd currently doesn't do what you want it to when it encounters a sieve 
reject.  lmtpd accepts the message provided the user exists, is under quota,
and that sort of thing, then later processes it with sieve.  In the case of
a reject, it does not write it to the mailbox, but instead makes an
external call to sendmail to send a rejection in MDN format.
Theoretically (and I haven't spent enough time with the lmtpd code
to know how possible this is), you could move processing of the sieve 
script up in the chain, so that if a reject were encountered, you could 
send back a 5xx code to sendmail as the recipient-specific return code, 
which would (in interactive mode) get passed back to the remote MTA, and 
the message would be rejected.  Now, you'd be in violation of the SIEVE 
RFC, because you'd be sending back a DSN instead of an MDN, but I think you 
could live with that.  Of course, the code probably wouldn't get accepted 
into the cyrus distro, but if you're okay with being a lone wolf that way, 
then it doesn't matter much.
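
To show the reordering being proposed, here is a small Python sketch of the
per-recipient decision with the sieve outcome consulted before the reply is
sent.  This is not lmtpd's code: the function, the reply texts, and the
idea of passing in an already-computed sieve action are all assumptions
made for illustration.

```python
def lmtp_reply(user_exists: bool, over_quota: bool, sieve_action: str) -> str:
    """Return the per-recipient LMTP reply if Sieve runs before acceptance.

    sieve_action is the already-computed outcome of the user's script
    ("keep", "reject", ...); evaluating real Sieve is out of scope here.
    """
    if not user_exists:
        return "550 5.1.1 User unknown"
    if over_quota:
        return "452 4.2.2 Over quota"
    if sieve_action == "reject":
        # The proposed change: answer 5xx now instead of accepting the
        # message and generating an MDN afterwards.
        return "550 5.7.1 User rejects message"
    return "250 2.1.5 Ok"
```

With interactive delivery, that 550 is what would travel back to the remote
MTA as the recipient-specific result.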

How does that information get passed to the users?  You convince sendmail 
to stamp all of the interesting information you want into the headers of 
the message, then get Sieve to filter on that instead.  You do this via the 
milter interface, which I go into in more detail below.
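
As a sketch of the "stamp it, then filter on it" idea, the Python below adds
connection information to a message as a header and then makes a per-user
decision from that header alone.  The header name X-Remote-IP, the function
names, and the addresses are hypothetical; a real setup would do the
stamping in the MTA and the filtering in the user's Sieve script.

```python
from email import message_from_string

RAW = (
    "From: someone@example.com\n"
    "To: user@example.org\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)


def stamp(raw: str, remote_ip: str) -> str:
    """MTA side: record connection information in a header."""
    msg = message_from_string(raw)
    msg["X-Remote-IP"] = remote_ip  # hypothetical header name
    return msg.as_string()


def user_wants_reject(raw: str, blocked_ips: set) -> bool:
    """User side: decide using only what is visible in the headers."""
    msg = message_from_string(raw)
    return msg.get("X-Remote-IP") in blocked_ips
```

The filter never needs to see the SMTP session itself; everything it cares
about has been copied into the message by the time it runs.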

Anyway, that's the strategy that's closest to what you were describing. 
For a good bit more coding work, but neater coding work and more standards 
compliance, you might consider...


Strategy #2 -- You leave Cyrus alone, run sendmail in interactive mode, and
write yourself a nice fancy milter to do everything you want to do.  I 
can't describe the milter interface in its entirety here, but effectively 
it's an API that allows you to run a threaded process external to the MTA 
that does the interesting filtering for you.  If done correctly, it has 
access to any and all information that sendmail has that you might want to 
filter on, including the envelope, the message headers, the message body, 
and connection information (such as the remote IP and such).  Filtering 
happens as the message is submitted, and the milter can halt the process at 
any point and reject or discard the message.  Furthermore, at the end of 
message submission, the milter may add headers, change headers, or replace 
the body of the message.  In short, you can filter mail about any way you 
could imagine with it, and you do it in-band upon message submission, so 
you can reject the thing if you want to.
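
The staged flow described above can be modelled in a few lines of Python.
This is a toy model only: the real milter interface is a C library with its
own callback names and return codes, none of which are reproduced here, and
every class, method, and rule below is invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    CONTINUE = "continue"
    REJECT = "reject"
    DISCARD = "discard"


@dataclass
class Submission:
    remote_ip: str
    sender: str
    recipients: List[str]
    headers: List[Tuple[str, str]]
    body: str
    added_headers: List[Tuple[str, str]] = field(default_factory=list)


class ToyFilter:
    """Callbacks in the order the message is submitted (rules are made up)."""

    def on_connect(self, sub):
        # Connection information: refuse one documentation-range network.
        blocked = sub.remote_ip.startswith("192.0.2.")
        return Verdict.REJECT if blocked else Verdict.CONTINUE

    def on_envelope(self, sub):
        return Verdict.CONTINUE if sub.recipients else Verdict.REJECT

    def on_headers(self, sub):
        subjects = [v for n, v in sub.headers if n.lower() == "subject"]
        hit = any("lottery" in s.lower() for s in subjects)
        return Verdict.DISCARD if hit else Verdict.CONTINUE

    def on_body(self, sub):
        return Verdict.CONTINUE

    def on_end_of_message(self, sub):
        # At the end of submission the filter may modify the message.
        sub.added_headers.append(("X-Toy-Filter", "passed"))
        return Verdict.CONTINUE


def run_filter(flt, sub):
    """Run each stage in turn and stop at the first non-CONTINUE verdict."""
    stages = (flt.on_connect, flt.on_envelope, flt.on_headers,
              flt.on_body, flt.on_end_of_message)
    for stage in stages:
        verdict = stage(sub)
        if verdict is not Verdict.CONTINUE:
            return verdict
    return Verdict.CONTINUE
```

Because a verdict can be returned at any stage, a message from a blocked
address is refused before its headers or body are ever looked at.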

So, if you really want to write in a bunch of user-specific stuff and 
manage the database yourself, you could just write all of that into a 
milter and run that alongside sendmail.  As a teaser, here at Duke, we've 
just written a generic regex-based milter that will hopefully take the 
place of procmail in our delivery setup, saving us tons of CPU cycles and 
giving performance a nice kick in the pants.  (It's still in the final 
stages of testing -- we'll hopefully be releasing it sometime soon.)  It 
doesn't do user-specific filters, but there's no reason you couldn't do 
something similar and write that in.
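
For the user-specific part, one possible shape is a table of regex rules
keyed by recipient.  The Python below is a minimal sketch of that idea and
has nothing to do with the milter mentioned above; the rule table, the
addresses, and the function name are all hypothetical, and a real version
would load the rules from whatever database you maintain.

```python
import re

# Hypothetical per-user rules: recipient -> list of (header name, pattern).
USER_RULES = {
    "alice@example.org": [("Subject", re.compile(r"\blottery\b", re.I))],
    "bob@example.org": [],
}


def recipients_rejecting(headers: dict, recipients: list) -> list:
    """Return the recipients whose own rules match this message's headers."""
    rejecting = []
    for rcpt in recipients:
        for name, pattern in USER_RULES.get(rcpt, []):
            if pattern.search(headers.get(name, "")):
                rejecting.append(rcpt)
                break  # one matching rule is enough for this recipient
    return rejecting
```

Recipients with no entry in the table simply get no filtering, so the
default is to accept.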


If you've really got the jones to code something, I'd go with #2.  Rather 
than re-inventing the wheel, you should be able to do everything you want 
with a little glue code, and still take advantage of the best of existing 
projects.  There may be a way to do something similar in Postfix, if you're 
really set on using that, but I'm not the one to help you there.

But I have to agree with all of the other sentiments expressed here. 
Wedging SMTP into Cyrus is a Bad Idea (tm).

Michael


--On Saturday, April 12, 2003 9:41 AM -0500 Phil Howard 
<phil-info-cyrus at ipal.net> wrote:

> On Sat, Apr 12, 2003 at 09:49:47AM -0400, Ken Murchison wrote:
>
>|
>|
>| Phil Howard wrote:
>| >
>| > Looks like I'll have to take a look at doing some sort of direct method
>| > of SMTP-to-Cyrus.  One option is an SMTP-to-LMTP front-end daemon that
>| > would include some filtering (perhaps just whatever Sieve doesn't do).
>| > The other would be modifying Cyrus to actually include SMTP capability.
>| > That would depend on how simple the LMTP daemon part is, either in how
>| > succinctly it deals with mail storing details, or how well it
>| > encapsulates so those details are immaterial.  I won't know until I
>| > get the time to study the source code more fully.
>|
>| LMTP and [E]SMTP are identical except for the hello commands, so making
>| lmtpd handle [E]SMTP is trivial, but I don't see the point.  Cyrus is
>| not designed to be an MTA, and I don't think you want to spend the time
>| doing so.
>
> I'm not actually considering this role to be that of an MTA.  Rather, it
> is the role of a final destination mail store that will generally be the
> only MX host for its served domains ... that will talk SMTP in that role.
>
> There are certain fundamental problems with having the email service split
> between a mail store and an MTA.  One of those is that there ends up
> having to be a lot of mail store functionality duplicated in the MTA,
> which the mail store could just as well do if it had the info, and was in
> control at the proper time.  In order to have a complete scope of spam
> control, whatever component is conducting the receiving end of an SMTP
> session has to have an understanding of end user wishes.
>
> And this is a special case of MTA where 100% of the inbound mail is only
> for delivery to the mail store.  Outbound mail won't be handled at this
> IP address and thus won't need to be handled by this MTA.  For outbound
> mail, I will be running a separate MTA which won't have any need to feed
> the mail store except for mail specifically addressed to a domain served
> by that mail store in which case it can send it by SMTP like any other
> MTA.
>
> Current MTAs lack the full ability to perform flexible spam controls as
> specified by the end user.  Either way I have to add something to some
> part of the whole mail server.  Adding all the spam controls to the MTA
> looks like a lot more work, given that it only queues mail, rather than
> do an SMTP to LMTP session pass through.  That means implementing not only
> full user awareness in the MTA, but also all the Sieve mechanism for those
> cases where the content is to be scanned by that method when the user
> wants that to be done for direct rejection.
>
>
>| > I take it no one else has any spam filtering going on that is both
>| > based on SMTP envelope, and is customized per user.
>|
>| Sieve does get the envelope, but it has no facilities for checking IP.
>| You might want to consider having your MTA do this type of SPAM checking
>| globally and adding an X-SPAM header which can then be filtered by
>| individual users via Sieve.
>
> Sending an X-SPAM header only works if the MTA has elected to accept the
> data being sent in the SMTP session.  The objective is to reject as much
> of the email as possible during the SMTP session by means of a 5XX
> response code, and avoid even the transmission of the data stream where
> that decision is final based on the
>
> Not all users want, or do anything with, a separate spam folder.  That
> just adds to the workload.  And in cases where legitimate mail goes into
> that folder, either the user has to spend all the same wasted time to
> look for it in there, or else the user ignores the spam folder.  In the
> latter case, you end up with legitimate mail not reaching the user but
> the sender thinking that since there is nothing bounced back, it has
> reached the user.  Ultimately, I think the user should make that decision
> and I believe many users will prefer to decide to refuse spam mail so
> that the sender gets a notification.  Users who have no intent to read
> their spam folder would be well advised to do it that way, and I want to
> give them that option.  I know that will be my own selection.
>
> User decisions end up having to be made at SMTP session time.  Having two
> software components involved is redundant.  Having duplicate user
> knowledge in an MTA is redundant.  By adding an SMTP capability to Cyrus,
> with the addition of user specified spam controls, redundancy is reduced.
> Whether this kind of spam control can be added to Sieve or not I don't
> know.  It would be nice if it were, but it looks like the standard
> language does not provide for that.  The best course I can see is a
> combination of controls where one set of specifications for SMTP metadata
> is used, and another set for content scanning is used, and the latter
> would have to be applied while the SMTP session is still working so the
> final response applies to the SMTP DATA command.  If Sieve can be made to
> do it at this time, that's certainly much better as it looks like an
> excellent way to specify content based controls.
>
> --
> -----------------------------------------------------------------
>| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
>| phil-nospam at ipal.net | Texas, USA | http://ka9wgn.ham.org/    |
> -----------------------------------------------------------------





