Sieve doesn't filter emails with words with accents

Kjetil Torgrim Homme kjetilho at ifi.uio.no
Thu Jun 15 21:09:27 EDT 2006


On Thu, 2006-06-15 at 16:33 +0200, Sandy Drobic wrote:
> Sebas PRE wrote:
> > I would like to create a sieve filter to deliver to a folder all mail with
> > "leído:" in the Subject: header, but it does not work.
> 
> Because 8 bit characters are not allowed in header lines. You need to look 
> for the encoded equivalent.

That's not correct.  The Sieve interpreter should decode headers as per
RFC 2047 (or RFC 2231, as appropriate) into Unicode.  I quote from RFC
3028:

2.7.2.   Comparisons Across Character Sets

   All Sieve scripts are represented in UTF-8, but messages may involve
   a number of character sets.  In order for comparisons to work across
   character sets, implementations SHOULD implement the following
   behavior:

      Implementations decode header charsets to UTF-8.  Two strings are
      considered equal if their UTF-8 representations are identical.
      Implementations should decode charsets represented in the forms
      specified by [MIME] for both message headers and bodies.
      Implementations must be capable of decoding US-ASCII, ISO-8859-1,
      the ASCII subset of ISO-8859-* character sets, and UTF-8.

   If implementations fail to support the above behavior, they MUST
   conform to the following:

      No two strings can be considered equal if one contains octets
      greater than 127.
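
The two behaviours quoted above can be sketched in Python, as an
illustration only.  This is not the Cyrus code; the function names
(decode_header_utf8, header_contains, fallback_equal) are hypothetical,
and the standard library's email.header module stands in for whatever
RFC 2047 decoder an implementation would use.

```python
from email.header import decode_header, make_header


def decode_header_utf8(raw_value: str) -> bytes:
    """Decode RFC 2047 encoded-words and return the value as UTF-8 octets."""
    text = str(make_header(decode_header(raw_value)))
    return text.encode("utf-8")


def header_contains(raw_value: str, key: str) -> bool:
    """Preferred behaviour: compare UTF-8 representations.

    `key` plays the role of the string from the Sieve script, which the
    RFC says is represented in UTF-8.
    """
    return key.encode("utf-8") in decode_header_utf8(raw_value)


def fallback_equal(a: bytes, b: bytes) -> bool:
    """Fallback behaviour: strings with octets above 127 never compare equal."""
    if any(octet > 127 for octet in a) or any(octet > 127 for octet in b):
        return False
    return a == b


# "le\u00eddo:" is "leído:" written with an escape to keep this file ASCII.
KEY = "le\u00eddo:"
```

With this sketch, a Subject sent as =?ISO-8859-1?Q?le=EDdo:?= and one
sent as =?UTF-8?B?bGXDrWRvOg==?= both match the same script string,
whereas the fallback rule refuses to match either.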

Cyrus is allowed not to match on accented characters as per the last
stanza, but it clearly would be beneficial if it supported character
sets other than US-ASCII.  The code actually tries to do so, but
there is a bug somewhere -- I think the problem is that the strings from
the script are not represented in UTF-8.  I couldn't quite keep track
across all the function pointers, however.
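
To illustrate the suspected failure mode (a guess, not a confirmed
diagnosis of the Cyrus code): if the header is decoded to UTF-8 but the
script string reaches the comparison in another encoding, the octets
differ and the match fails.  ISO-8859-1 is assumed here purely for the
sake of the example, and octets_contain is a hypothetical helper.

```python
# "le\u00eddo:" is "leído:" written with an escape to keep this file ASCII.
KEY = "le\u00eddo:"

# What a correctly decoded Subject header would contain (UTF-8 octets).
header_octets = KEY.encode("utf-8")                 # b'le\xc3\xaddo:'

# What the script string would look like if it were NOT held as UTF-8
# (ISO-8859-1 is an assumption made for this illustration).
script_octets_latin1 = KEY.encode("iso-8859-1")     # b'le\xeddo:'


def octets_contain(haystack: bytes, needle: bytes) -> bool:
    """Octet-wise substring test, as a :contains comparison might do."""
    return needle in haystack
```

Here octets_contain(header_octets, script_octets_latin1) is false even
though both sides spell the same word, which is consistent with the
symptom reported.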

-- 
Kjetil T.



