Automatic content-type insertion

Rob Mueller robm at fastmail.fm
Sun Dec 1 18:43:23 EST 2002


I'm wondering why Cyrus automatically reports a Content-Type charset for
every message, even when the message itself specifies none. For example:

* 119 FETCH (FLAGS (\Recent \Seen) RFC822 {...}
To: robm at fastmail.fm
From: test at fastmail.fm
Subject: Nothing much
Message-Id: <20021201233319.9F3BA3E816 at server2.fastmail.fm>
Date: Sun,  1 Dec 2002 18:33:19 -0500 (EST)

Just a little text

)
. OK Completed
. fetch 119 bodystructure
* 119 FETCH (BODYSTRUCTURE ("TEXT" "PLAIN" ("CHARSET" "us-ascii") NIL NIL
"7BIT" 22 2 NIL NIL NIL))
. OK Completed

So there's no "Content-Type" line in the message, but the BODYSTRUCTURE
response gives it an implicit charset of us-ascii. I know this is technically
correct, since us-ascii is the MIME default when no charset is given, but
unfortunately there seem to be quite a few broken iso-2022-jp messages out
there that don't specify the charset in the header. Our site allows a
'default charset', which is used when a message supplies no charset, and
which would work fine for these messages. The trouble is that nothing in the
response indicates whether ("CHARSET" "us-ascii") was auto-generated or
explicitly set, so the default never gets applied.

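To make the problem concrete, here is a rough Python sketch of the kind of
'default charset' fallback described above. The function name, the dict of
BODYSTRUCTURE parameters, and the choice of iso-2022-jp as the site default
are all assumptions for illustration, not our actual code.

```python
def pick_charset(body_params, site_default="iso-2022-jp"):
    """Choose a charset from BODYSTRUCTURE parameters (hypothetical helper).

    body_params is assumed to be a dict with lower-cased keys,
    e.g. {"charset": "us-ascii"}; an empty dict means no parameters.
    """
    return body_params.get("charset") or site_default


# Five hiragana characters, encoded the way a broken sender would send them:
# iso-2022-jp bytes with no Content-Type header at all.
original = "\u3053\u3093\u306b\u3061\u306f"
raw = original.encode("iso-2022-jp")

# If the server reported no charset, the site default would be used.
text_with_default = raw.decode(pick_charset({}))

# With the auto-generated ("CHARSET" "us-ascii"), the default is never
# consulted. iso-2022-jp is a 7-bit encoding, so the ASCII decode does not
# fail; it just yields escape sequences instead of the intended text.
text_as_ascii = raw.decode(pick_charset({"charset": "us-ascii"}))
```

The point is that the second decode raises no error, so the client has no
signal that anything went wrong.
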
The main solutions I see are:
1. Have Cyrus stop setting an implicit charset when none is supplied
2. us-ascii is a subset of most encodings, so always allow a reported
us-ascii charset to be overridden
3. Do a fetch body[header.fields (Content-Type)] to see if the header
actually exists
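
For option 3, a minimal Python sketch of the client-side check, assuming the
header bytes have already been fetched with something like
body.peek[header.fields (Content-Type)]. The function names are hypothetical,
and none of this is anything Cyrus itself provides.

```python
from email import message_from_bytes


def explicit_charset(header_bytes):
    """Return the charset named in the fetched Content-Type header.

    Returns None when there is no Content-Type header, or when the header
    has no charset parameter. The result is lower-cased by the email package.
    """
    msg = message_from_bytes(header_bytes)
    return msg.get_content_charset()


def choose_charset(header_bytes, site_default):
    """Use the explicit charset if the message has one, else the site default."""
    explicit = explicit_charset(header_bytes)
    if explicit is not None:
        return explicit
    # No charset in the message itself, so any "us-ascii" in BODYSTRUCTURE
    # was implied by the server and the site default applies.
    return site_default


# A message with no Content-Type header: the fetch returns only a blank line.
no_header = b"\r\n"
# A message that really does declare its charset.
with_header = b"Content-Type: text/plain; charset=ISO-2022-JP\r\n\r\n"
```

The cost is an extra round trip per message, so a client would probably only
do this when BODYSTRUCTURE reports us-ascii.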

Have other people found ways to deal with this?

Rob





More information about the Info-cyrus mailing list