cyrus parsing email addresses problem?

Rob Mueller robm at fastmail.fm
Thu Jan 29 18:50:36 EST 2009


I've just come across a user reporting a bug with an email. They have a From 
header that looks like:

From: =?iso-8859-1?Q?SOMETHING:other_text_here?= <email at address.com>

This gets parsed into the envelope:

((NIL NIL "=?iso-8859-1?Q?SOMETHING" NIL)("other_text_here?=" NIL "email" 
"address.com")(NIL NIL NIL NIL))

What's happening is that it's treating "SOMETHING:" as an email "group" (see 
section 3.4 of RFC 2822 - http://www.faqs.org/rfcs/rfc2822.html).

Looking at RFC 1522 section 5 (3), it replaces the definition of phrase. 
Let's put the relevant parts of RFC 2822 and RFC 1522 together:

address = mailbox / group
mailbox = name-addr / addr-spec
name-addr = [display-name] angle-addr
group = display-name ":" [mailbox-list / CFWS] ";" [CFWS]
display-name = phrase
phrase = 1*(encoded-word / word)
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
encoded-text = 1*<Any printable ASCII character other than "?" or SPACE>
word = atom / quoted-string
atom = [CFWS] 1*atext [CFWS]
atext = ALPHA / DIGIT / !#$%&'*+-/=?^_`{|}~

Applying all that to:

=?iso-8859-1?Q?SOMETHING:other_text_here?= <email at address.com>

The result is "almost" ambiguous. The one thing that means this should be 
interpreted as a "mailbox" rather than a "group" is that there's no ending 
";" in there, so the "group" pattern shouldn't match. Wow, this is a pain to 
parse, because you potentially have to look a long way ahead (to see whether 
there is an ending ";") to decide how to interpret this :(
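
To make the look-ahead idea concrete, here is a rough sketch in Python (not 
the Cyrus C code). The function name looks_like_group is made up for 
illustration, and the check is deliberately simplified: it only tracks 
quoted strings and ignores comments and domain literals, which a real 
parser would also have to skip.

```python
def looks_like_group(header_value):
    """Return True only if an unquoted ':' is later followed by an unquoted ';'.

    Simplified assumption: comments "(...)" and domain literals "[...]"
    are not handled, only double-quoted strings.
    """
    in_quotes = False
    escaped = False
    seen_colon = False
    for ch in header_value:
        if escaped:
            # Character after a backslash inside a quoted string is literal.
            escaped = False
        elif ch == "\\" and in_quotes:
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == ":":
                seen_colon = True
            elif ch == ";" and seen_colon:
                return True
    return False
```

With this check, the problem header value above is not a group because no 
";" ever follows the ":", while something like "undisclosed-recipients:;" 
is.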

A "hack" that would probably handle this better would be to alter 
parseaddr_phrase(inp, phrasep, specials) to be aware of RFC1522 type 
encoding, so that when you're in a =?....?= sequence, we ignore the 
"specials" parameter character list.

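A minimal sketch of that idea, written in Python rather than C just to show 
the logic. This is not the real parseaddr_phrase; scan_phrase and 
ENCODED_WORD are hypothetical names, and the pattern is a loose 
approximation of the encoded-word grammar quoted above.

```python
import re

# Loose approximation of: "=?" charset "?" encoding "?" encoded-text "?="
# (assumption: none of the three parts contains "?" or whitespace)
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[^?\s]+\?[^?\s]+\?=")


def scan_phrase(text, specials):
    """Split text into (phrase, rest).

    The phrase ends at the first character from `specials` that is found
    outside an encoded-word; characters inside =?...?= are never treated
    as specials.
    """
    i = 0
    while i < len(text):
        m = ENCODED_WORD.match(text, i)
        if m:
            # Skip the whole encoded-word without checking for specials.
            i = m.end()
            continue
        if text[i] in specials:
            break
        i += 1
    return text[:i], text[i:]


phrase, rest = scan_phrase(
    "=?iso-8859-1?Q?SOMETHING:other_text_here?= <email@address.com>", ":<"
)
```

Here phrase keeps the whole encoded-word, including the ":", and rest 
starts at "<". A plain "Group: a@b;" still stops at the ":" as before.
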
Anyone have any thoughts?

Rob


