Use of isspace() and friends in Cyrus
David Carter
dpc22 at cam.ac.uk
Thu Mar 19 07:25:56 EDT 2009
I spent some time last night working out why the cyrus.cache entry for a
specific message differed on a master and replica system. An eight bit
character in a To: header (allowed by "reject8bit: no") had been
transformed into a space character in the cache on the replica.
The root cause turned out to be some code in parseaddr_phrase() which
replaces arbitrary sequences of "space" characters with a single space:
./lib/parseaddr.c:
static int parseaddr_phrase(inp, phrasep, specials)
char **inp;
char **phrasep;
char *specials;
{
int c;
char *src = *inp; /* XXX signed char on most platforms */
for (;;) {
c = *src++; /* XXX signed char promoted to signed int */
. . .
else if (isspace(c) || c == '(') {
src--;
SKIPWHITESPACE(src);
*dst++ = ' ';
}
. . .
}
}
isspace(-10) isn't defined.
Worse isspace() is normally implemented as an array lookup and can
segfault with a negative index if you are really unlucky.
A quick search through the code reveals quite a lot of places where char
variables are passed to isspace() without a cast to (unsigned char). The
ctype macros often causes grief, and the normal workaround is to define
replacement macros of the form:
#define Uisspace(c) isspace((int)((unsigned char)(c)))
I'm happy to knock up a sample patch if people agree.
Curiously reconstruct on the replica fixed the cache entry generated by
sync_server, despite the fact both are just using message_parse_file().
I infer that isspace(-10) is normally false but occasionally true, at
least on Solaris x86. It appears to always be false on Linux x86.
--
David Carter Email: David.Carter at ucs.cam.ac.uk
University Computing Service, Phone: (01223) 334502
New Museums Site, Pembroke Street, Fax: (01223) 334679
Cambridge UK. CB2 3QH.
More information about the Cyrus-devel
mailing list