Searching on RFC2047 headers

OBATA Akio obata at lins.jp
Sat Nov 5 21:19:06 EDT 2011


Hi,

On Sun, 06 Nov 2011 05:26:15 +0900, Ross Boylan <ross at biostat.ucsf.edu> wrote:

> I'm running Cyrus IMAP 2.2.13 on Debian, and am looking for a way to
> search on headers that use RFC2047.  I am not able to retrieve anything.
> Here's what happens (using imtest)
> a06 uid fetch 152840 (BODY[HEADER.FIELDS (subject)])
> * 60894 EXISTS
> * 1 RECENT
> * 59873 FETCH (UID 152840 BODY[HEADER.FIELDS (subject)] {72}
> Subject: =?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?=
>
> )
> a07 uid search (HEADER SUBJECT "=?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?=")
> * SEARCH
> a07 OK Completed (0 msgs in 0.140 secs)
> a07 uid search charset us-ascii (HEADER SUBJECT "=?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?=")
> * SEARCH
> a07 OK Completed (0 msgs in 0.140 secs)
> a07 uid search charset windows-1251 (HEADER SUBJECT "=?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?=")
> a07 NO Unrecognized character set
> a07 uid search charset us-ascii (HEADER SUBJECT "z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4")
> * SEARCH
> a07 OK Completed (0 msgs in 0.150 secs)
> a07 uid search charset utf-8 (HEADER SUBJECT "\xcf\xee\xeb\xf3\xf7\xe8\xf2\xfc 11 \xea\xf3\xf0\xf1\xee\xe2 \xe1\xe5\xf1\xef\xeb\xe0\xf2\xed\xee.")
> * SEARCH
> a07 OK Completed (0 msgs in 0.160 secs)
> a07 uid search (HEADER SUBJECT "\xcf\xee\xeb\xf3\xf7\xe8\xf2\xfc 11 \xea\xf3\xf0\xf1\xee\xe2 \xe1\xe5\xf1\xef\xeb\xe0\xf2\xed\xee.")
> * SEARCH
> a07 OK Completed (0 msgs in 0.130 secs)
> a08 uid search (HEADER SUBJECT {63}
> + go ahead
> =?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?=
>
> )
> * SEARCH
> a08 OK Completed (0 msgs in 0.150 secs)
>
> The docs for Cyrus 2.2 do not refer to rfc2047, and so I would expect
> the string to be treated as plain text and for any of my initial
> searches to work.  Obviously they don't.

See RFC 3501, section 6.4.4, for the search syntax.
The search string must be given as plain text (not MIME-encoded), with its CHARSET specified.
The CHARSET does not have to be the same as the charset used in the MIME encoding.
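
A minimal sketch of that in Python, assuming the client decodes the
encoded-word itself and sends the text as a UTF-8 literal. The helper names
and the tag "a09" are made up for illustration, and whether 2.2.13 accepts
CHARSET UTF-8 for this search is something to check on the server. Note that
the bytes in the quoted "charset utf-8" attempt (\xcf\xee\xeb...) are the
windows-1251 bytes, not UTF-8, so the declared charset did not match the data.

```python
from email.header import decode_header

# Encoded-word copied from the Subject header in the quoted FETCH response.
raw = "=?Windows-1251?B?z+7r8/fo8vwgMTEg6vPw8e7iIOHl8e/r4PLt7i4=?="

def decode_subject(value):
    """Decode RFC 2047 encoded-words into a plain Python string."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded-words arrive as bytes plus the charset they declared.
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)

def build_search(tag, text):
    """Return (command line, literal payload) for a UTF-8 SUBJECT search.

    The tag is a hypothetical value; the payload is sent after the
    server's continuation response, as in the quoted a08 exchange.
    """
    payload = text.encode("utf-8")
    line = "%s UID SEARCH CHARSET UTF-8 HEADER SUBJECT {%d}" % (tag, len(payload))
    return line, payload

subject = decode_subject(raw)
line, payload = build_search("a09", subject)
print(line)
```

The literal length is counted in bytes of the UTF-8 form, not in characters,
which is why the payload is encoded before the command line is built.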

> I've seen references to cyrus canonicalizing search strings in later
> versions
> (http://www.cyrusimap.org/docs/cyrus-imapd/2.4.9/internal/internationalization.php)  I'm not sure what the implications of that for this problem are.

I suspect that 2.2.13 simply does not support windows-1251 yet.
(I can find it in both 2.3.18 and 2.4.12.)

-- 
OBATA Akio / obata at lins.jp

