How to filter based on "garbage" subjects ... ?

Rob Siemborski rjs3 at andrew.cmu.edu
Tue Sep 30 12:21:27 EDT 2003


On Tue, 30 Sep 2003, Cyrus Daboo wrote:

> i.e. SIEVE should be decoding the =?euc-kr?.... header into its utf8 form
> BEFORE doing the comparison with the text you provide. i.e. the =B1
> quoted-printable encoded character will have been decoded into the utf8
> representation of that for the euc-kr character set, and thus won't match
> the text you provide. Actually the euc-kr character set is a multibyte
> character set so in fact the unicode character is made up of =B1 and =A4.
> By my reckoning that is the unicode character 0xad11 - I'll leave you to
> work out the utf8 encoding of that!
>
> Basically you are going to have a hard time trying to filter on arbitrary
> unicode characters in some random character set given that sieve expects
> utf8 in its scripts.
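
For anyone who wants to check the arithmetic above, here is a minimal Python
sketch of the decoding step. It assumes a hypothetical encoded-word containing
only the two bytes mentioned (=B1 and =A4); the original Subject header is not
shown in this thread, so the header string below is made up for illustration.
It uses only the standard library and says nothing about how Cyrus's sieve
performs the conversion internally.

```python
from email.header import decode_header

# Hypothetical encoded-word: only the =B1=A4 pair comes from the thread,
# the rest of the real Subject header is unknown.
header = "=?euc-kr?Q?=B1=A4?="

# decode_header undoes the quoted-printable layer and reports the charset.
raw, charset = decode_header(header)[0]

# Interpret the raw bytes in the declared charset (EUC-KR, two bytes per character here).
char = raw.decode(charset)
codepoint = ord(char)

# This is the byte sequence a UTF-8 sieve script would have to contain.
utf8_bytes = char.encode("utf-8")

print(hex(codepoint))    # 0xad11
print(utf8_bytes.hex())  # eab491
```

So a script comparing against the literal two EUC-KR bytes would not match,
while one containing the three-byte UTF-8 sequence EA B4 91 should, provided
the header is decoded to UTF-8 before the comparison.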

Note that there may be some breakage in the UTF-8 handling of Cyrus IMAP's
sieve implementation.

At the very minimum, see these bug reports:

https://bugzilla.andrew.cmu.edu/show_bug.cgi?id=1710
https://bugzilla.andrew.cmu.edu/show_bug.cgi?id=1725

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper





More information about the Info-cyrus mailing list