cs at schulte.it
Fri May 9 16:29:02 EDT 2003
John Alton Tamplin wrote:
> Christian Schulte wrote:
>> That is, it contains ASCII-Characters > 127 ? So some client or
>> whatever introduced wrong header data because there can only be
>> headers containing 7-bit characters and 8-bit data can only be
>> contained encoded as per RFC2047 in 7 bit representation ?
> Correct. The message header is constrained to contain only ASCII
> characters because the header is what is used to define the character
> set for the rest of the message. RFC2047 gives a way to encode the
> data portions of the header in a self-contained format that is still
> ASCII to a mail client that doesn't understand RFC2047.
>> We assume the character > 127 is in the configured default charset
>> which could be taken from LC_ environment and try to convert the 8
>> bit data to a rfc2047 7bit representation leading to an incorrect
>> converted header if the default character set is not the correct one
>> ? How will you test against not converting to a wrong representation ?
> Rather than taking the language from an environment variable, it would
> be configured in imapd.conf. You can't test against it being
> incorrectly converted because there is no information to identify what
> character set was intended. Presumably, if the rest of the message is
> MIME and has a character set you could assume the header is the same,
> but the mail clients I have seen that support MIME support the proper
> encoding of the headers.
>> So we cannot mangle or reject messages any more which contain 8bit
>> header data not in the configured charset because we will convert
>> them to 7bit using another (maybe completely wrong) charset ? I'll
>> have to read other rfcs than rfc2047 to understand it completely, I
> The point is the message is already broken because it includes
> non-ASCII characters but no indication of what character set is in
> use. The mail system administrator making a guess for a default is as
> good as the mail client making a guess and doesn't require storing
> non-compliant messages in the mail store. The existing behavior in
> Cyrus is either to reject the message or mangle the offending
> characters into X.
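
As an illustration of the two options discussed in the quoted text, here is a minimal Python sketch. It is not Cyrus code: the function names (has_8bit, mangle_to_x, encode_with_default) and the sample subject are made up for this example, and the "default charset" is simply a parameter standing in for whatever an administrator might configure.

```python
from email.header import Header

def has_8bit(raw: bytes) -> bool:
    """Return True if the raw header value contains any byte above 127."""
    return any(b > 127 for b in raw)

def mangle_to_x(raw: bytes) -> str:
    """Replace every 8-bit byte with 'X', keeping the ASCII bytes untouched."""
    return "".join(chr(b) if b < 128 else "X" for b in raw)

def encode_with_default(raw: bytes, default_charset: str) -> str:
    """Assume raw is in default_charset and emit an RFC 2047 encoded-word.

    This is only a guess about the sender's charset; a byte sequence that
    is invalid in default_charset raises UnicodeDecodeError.
    """
    text = raw.decode(default_charset)
    return Header(text, charset=default_charset).encode()

# Hypothetical broken header value: raw iso-8859-15 bytes, no encoding label.
raw_subject = "Grüße".encode("iso-8859-15")

print(has_8bit(raw_subject))      # True
print(mangle_to_x(raw_subject))   # GrXXe
print(encode_with_default(raw_subject, "iso-8859-15"))
```

The encoded-word is pure ASCII, so it is safe to store, but it is only correct if the guessed charset matches what the sender actually used.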
Such functionality can and should be integrated into the MTA then, I
think! I tested sendmail's ability to do this but got very strange
results, so I turned that feature off after a few weeks. About 90% of
all broken messages were not in my default iso-8859-15 charset and got
converted to something unreadable! That led to some clients showing
totally garbled From or even Subject headers where Cyrus would only
have shown single X characters...
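
To make the failure mode concrete, here is a small Python sketch of what a wrong default does. The sample text and the choice of koi8-r as the sender's real charset are assumptions for illustration only; this does not reproduce what sendmail does internally.

```python
def misread(raw: bytes, guessed_charset: str) -> str:
    """Decode raw header bytes using a guessed charset (possibly the wrong one)."""
    return raw.decode(guessed_charset)

# Assumed scenario: the sender wrote the header in koi8-r without labeling it.
original = "Привет"
raw = original.encode("koi8-r")

# The server guesses its configured default, iso-8859-15 in this example.
shown = misread(raw, "iso-8859-15")

print(shown)              # unrelated Latin letters and symbols, not Cyrillic
print(shown == original)  # False
```

Every byte is valid in iso-8859-15, so the conversion raises no error. The wrong guess can then be stored as if it were correct, and the client has no way to tell.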
Of course 99% of all broken messages are spam. Since installing
SpamAssassin I no longer get nearly as many broken messages...
My experience with sendmail makes me think that such a feature would not
add anything really useful, and Cyrus already handles broken messages
in an elegant way anyway!