Re: String encoding questions


Subject: Re: String encoding questions
From: Dom Lachowicz (dominicl@seas.upenn.edu)
Date: Thu Aug 23 2001 - 12:39:52 CDT


> ISO-8859 is used in the GUI, right? What about filenames and internal
> identifiers? Always ASCII?

Currently, ASCII is used for the GUI, filenames, and other internal ids.
Andrew Dunbar had a patch to change this, but it was never completed or fully
integrated. Re-integrating it into the tree will be hard, but I will try.
 
> Hmm. So are you saying that UT_convert doesn't fully support UCS-2 yet?

UT_convert handles any encodings supported by iconv, so UCS-2 will work fine.
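
For illustration, here is a minimal sketch of the kind of iconv call that sits
underneath a conversion like this. It is not UT_convert itself: sketch_convert
and its parameters are hypothetical names, and it assumes a non-stateful target
encoding such as UCS-2, so no final flush call is made.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not AbiWord's UT_convert: converts in_len bytes of
 * `in` from from_codeset to to_codeset with a single iconv() call.
 * Returns the number of bytes written to `out`, or (size_t)-1 on failure. */
size_t sketch_convert(const char *in, size_t in_len,
                      const char *to_codeset, const char *from_codeset,
                      char *out, size_t out_cap)
{
    /* iconv_open takes the target codeset first, then the source. */
    iconv_t cd = iconv_open(to_codeset, from_codeset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in_ptr = (char *)in;   /* iconv wants char **, it does not modify the data */
    char *out_ptr = out;
    size_t in_left = in_len;
    size_t out_left = out_cap;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;       /* errno is EILSEQ, EINVAL, or E2BIG */
    return out_cap - out_left;   /* bytes produced */
}
```

Any pair of codeset names that the local iconv accepts can be passed in, which
is why UCS-2 needs no special handling on the calling side.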
 
> In any case, the comment is inaccurate. The comments on to_codeset and
> from_codeset are transposed. The comment says that len=0 is interpreted
> as len=strlen(str), but the code uses strlen only if len<0, which can't
> happen since len is UT_uint32. It also bothers me that iconv returning
> EINVAL and not converting the whole input is an error if bytes_read_arg
> is NULL but not otherwise. And it bothers me that most of the function
> is in a try block even though only one line of code can throw an
> exception.
>
> I admit, I am picky. ;-) I did finally get it to work, though, so I
> guess I'm OK now.
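
The len point above can be sketched in a few lines. This is a reduced
illustration based only on the description in the quoted text, not the actual
UT_convert source; the function names are hypothetical and the UT_uint32
typedef is a stand-in.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t UT_uint32; /* assumption: stand-in for AbiWord's own typedef */

/* The check as described in the quoted text. len is unsigned, so
 * len < 0 is always false and the strlen() branch can never run. */
size_t len_as_coded(const char *str, UT_uint32 len)
{
    if (len < 0)
        return strlen(str);
    return len;
}

/* What the comment promises: len == 0 means "use strlen(str)". */
size_t len_as_documented(const char *str, UT_uint32 len)
{
    return (len == 0) ? strlen(str) : (size_t)len;
}
```

With len = 0, the first version treats the input as empty, while the second
falls back to strlen as the comment says it should.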

I will update UT_convert as I see fit. I haven't really touched it in a
while.
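
As background on the EINVAL case mentioned in the quoted text: iconv reports
EINVAL when the input ends in the middle of a multibyte character, and it
leaves the input pointer at the start of the incomplete sequence. A caller that
receives the number of bytes read can resume from there; a caller that does not
has no way to recover. The sketch below only demonstrates the iconv behaviour.
sketch_probe is a hypothetical name, the codesets are fixed for the example,
and the small output buffer means it is only meant for short inputs.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical probe: runs one iconv() call from UTF-8 to UCS-2LE.
 * Returns 0 on success or the errno value on failure, and stores the
 * number of input bytes actually consumed in *bytes_read. */
int sketch_probe(const char *in, size_t in_len, size_t *bytes_read)
{
    char out[64];                /* enough for short test inputs only */
    *bytes_read = 0;

    iconv_t cd = iconv_open("UCS-2LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return errno;

    char *in_ptr = (char *)in;
    char *out_ptr = out;
    size_t in_left = in_len;
    size_t out_left = sizeof out;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    int err = (rc == (size_t)-1) ? errno : 0;

    /* Even on failure, everything before the bad or incomplete
     * sequence has been consumed and converted. */
    *bytes_read = in_len - in_left;

    iconv_close(cd);
    return err;
}
```

Feeding it an 'A' followed by only the first byte of a two-byte UTF-8 sequence
gives EINVAL with one byte read, which is exactly the partial-conversion case
where the bytes-read count matters.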

Dom


