RFC: Charset Conversion Routines

Alexey Melnikov alexey.melnikov at isode.com
Tue Feb 24 06:20:31 EST 2009


Bron Gondwana wrote:

>d) Whitespace compression.  I'm currently mapping all
>   whitespace to ' ' instead of '', and then either stripping
>   all ' ' from the string, or only outputting them if the
>   previous character on the output string was not a space.
>   Rob tells me that there are some issues with Asian charsets,
>   where a space has no meaning - how best to handle this?
>  
>
I think that no matter what you do with whitespace compression, it might
not work for some languages. So I wouldn't worry too much about this, as
long as the procedure is optional (or can be controlled by a
configuration option or by the client).
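
To make the "optional" part concrete, here is a minimal sketch of the two
behaviours described above (collapse runs of whitespace to one space, or
strip whitespace entirely), selected by a mode argument. The names
ws_mode, WS_KEEP, WS_COLLAPSE, WS_STRIP and ws_compress are hypothetical
and are not part of the existing charset code; the whitespace test is
ASCII-only, which is an assumption.

```c
#include <stddef.h>

/* Hypothetical mode values; not part of the existing charset API. */
enum ws_mode {
    WS_KEEP,      /* leave whitespace untouched */
    WS_COLLAPSE,  /* emit a single ' ' per run of whitespace */
    WS_STRIP      /* drop all whitespace */
};

/* ASCII-only whitespace test (assumption). Non-ASCII spaces would need
 * to be handled after decoding. */
static int is_ws(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Rewrite the NUL-terminated string s in place; return the new length. */
size_t ws_compress(char *s, enum ws_mode mode)
{
    size_t in, out = 0;

    if (mode == WS_KEEP) {
        while (s[out] != '\0')
            out++;
        return out;
    }

    for (in = 0; s[in] != '\0'; in++) {
        unsigned char c = (unsigned char)s[in];

        if (!is_ws(c)) {
            s[out++] = (char)c;
        } else if (mode == WS_COLLAPSE && (out == 0 || s[out - 1] != ' ')) {
            /* Only output a space if the previous output char was not one. */
            s[out++] = ' ';
        }
        /* WS_STRIP: whitespace is simply not copied. */
    }
    s[out] = '\0';
    return out;
}
```

With WS_KEEP as the default, a site that handles languages where this
compression is harmful can simply leave it switched off.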

>e) Interfaces, interfaces, interfaces.  At the moment we have:
>
>* charset_compilepat - for use in:
>  * charset_searchstring
>  * charset_searchfile
>* charset_decode_mimebody - and
>  * charset_encode_mimebody
>* charset_extractfile
>
>My current implementation that I'm working on uses "int flags"
>as an extra parameter to each of these, allowing CHARSET_CANON
>and CHARSET_STRIPSPACE to be passed down to the translation
>layer.
>
This looks sensible.
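
For illustration, a rough sketch of how such a flags argument might be
threaded from an entry point down to the translation layer. Only the
flag names CHARSET_CANON and CHARSET_STRIPSPACE come from the proposal;
their numeric values, the functions sketch_translate and
sketch_search_prepare, and the use of ASCII lowercasing as a stand-in
for canonicalisation are all assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Flag names are from the proposal; the bit values are assumptions. */
#define CHARSET_CANON      (1 << 0)
#define CHARSET_STRIPSPACE (1 << 1)

/* Hypothetical translation layer: copy src to dst, applying flags.
 * dst must have room for strlen(src) + 1 bytes. Returns the output
 * length. */
static size_t sketch_translate(const char *src, char *dst, int flags)
{
    size_t n = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if ((flags & CHARSET_STRIPSPACE) && isspace(c))
            continue;                      /* drop whitespace */
        if (flags & CHARSET_CANON)
            c = (unsigned char)tolower(c); /* stand-in for canonicalisation */
        dst[n++] = (char)c;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical entry point: it only forwards the caller's flags. */
size_t sketch_search_prepare(const char *s, char *out, int flags)
{
    return sketch_translate(s, out, flags);
}
```

A caller that passes 0 gets the input back unchanged, so existing call
sites can be converted mechanically before any of them opts in to a
flag.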

An alternative is to implement whitespace compression in a separate
function, layered on top of the charset API.
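
A minimal sketch of that layering, assuming the underlying charset call
returns a newly allocated, NUL-terminated string. The decode_fn type,
dup_decode (a trivial stand-in for a real decoder) and decode_and_strip
are hypothetical names used only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical shape of a charset API call: returns a malloc'd,
 * NUL-terminated string, or NULL on failure. */
typedef char *(*decode_fn)(const char *input);

/* Stand-in decoder for the example: just duplicates its input. */
char *dup_decode(const char *input)
{
    char *copy = malloc(strlen(input) + 1);

    if (copy != NULL)
        strcpy(copy, input);
    return copy;
}

/* Remove ASCII whitespace from s in place. */
static void strip_spaces(char *s)
{
    char *out = s;

    for (; *s != '\0'; s++) {
        if (*s != ' ' && *s != '\t' && *s != '\r' && *s != '\n')
            *out++ = *s;
    }
    *out = '\0';
}

/* Layered helper: call the unchanged decoder, then post-process.
 * The caller owns (and must free) the result. */
char *decode_and_strip(decode_fn decode, const char *input)
{
    char *s = decode(input);

    if (s != NULL)
        strip_spaces(s);
    return s;
}
```

The trade-off is an extra pass over the decoded text, in exchange for
leaving the existing charset function signatures untouched.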

>Would people be happy with that as an interface?  It's
>somewhat invasive, needing changes through lots of imap/*.c and
>sieve/*.c files.
>
>Bron.
>  
>


