8-bit characters in headers

Adrian Buciuman impersonala at gmail.com
Mon May 2 15:26:47 EDT 2005


On 4/29/05, Henrique de Moraes Holschuh <hmh at debian.org> wrote:
> On Fri, 29 Apr 2005, Adrian Buciuman wrote:
> > 4. Care should be taken not to rfc2047-encode text which must be
> > ASCII.  Even when properly encoded, non-ASCII is not valid anywhere in
> > headers.
> 
> Huh?  I don't understand.  ASCII is codepoints 0x00-0x7f.  Nothing with an
> 8th bit set could be ASCII.  Also, header names certainly cannot be
> rfc2047-encoded, but the whole point of encoding header *content* with
> rfc2047 is to allow *ANY* valid codepoint [subject to the usual no control
> characters, etc] in the specified charset for content...
> 
Sorry, I made a mistake in English usage. I meant: "Even when
properly encoded, non-ASCII characters are not valid *everywhere* in
headers."

To quote rfc2047:
"In particular:

   + An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.

   + An 'encoded-word' MUST NOT appear within a 'quoted-string'.

   + An 'encoded-word' MUST NOT be used in a Received header field.

   + An 'encoded-word' MUST NOT be used in parameter of a MIME
     Content-Type or Content-Disposition field, or in any structured
     field body except within a 'comment' or 'phrase'."

(there are more restrictions in rfc2047)
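
To illustrate the first rule, here is a minimal sketch in Python (not
anything Cyrus itself does) that encodes only the display name, which is a
'phrase', and leaves the 'addr-spec' as plain ASCII. It relies on the
standard library's email.utils.formataddr(); the helper names
build_from_header and decode_display_name, and the example name and
address, are made up for this illustration.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def build_from_header(display_name: str, addr_spec: str) -> str:
    """Return an address header value with only the display name encoded.

    formataddr() rfc2047-encodes the display name when it contains
    non-ASCII characters, and raises UnicodeEncodeError if the addr-spec
    itself is not pure ASCII, so the address is never encoded.
    """
    return formataddr((display_name, addr_spec), charset="utf-8")


def decode_display_name(header_value: str) -> str:
    """Recover the readable display name from an address header value."""
    encoded_name, _addr_spec = parseaddr(header_value)
    return str(make_header(decode_header(encoded_name)))


if __name__ == "__main__":
    # Hypothetical name and address, used only for the demonstration.
    value = build_from_header("J\u00f6rg Example", "jorg@example.org")
    print(value)                       # encoded-word followed by <addr-spec>
    print(decode_display_name(value))  # the original display name
```

The encoded-word ends up only in the display name; passing a non-ASCII
address to build_from_header fails instead of producing an invalid header.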

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html



