lines of UTF16 text are broken?

Kee Nethery kee at kagi.com
Wed Mar 30 13:24:12 EDT 2011


Looks like when I call uniencode(inputText,"UTF8") it turns the don't symbol character into two bytes. Using the uniencode and unidecode functions correctly resolves this issue, thanks!
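
For anyone hitting the same thing, here is a minimal sketch of the idea in Python rather than LiveCode: decode the UTF8 bytes to text first, then cut the columns by character position. The column ranges come from the example quoted below; the euro sign is only a stand-in for the symbol in my data, and slice_fields and the sample record are made up for illustration.

```python
# Column layout from the example in this thread: 1-based, inclusive positions.
COLUMNS = {"c1": (1, 6), "c2": (8, 17), "c3": (19, 30)}


def slice_fields(line, columns=COLUMNS):
    """Cut a line (str or bytes) into fields by 1-based inclusive positions."""
    return {
        name: line[start - 1:end].rstrip()
        for name, (start, end) in columns.items()
    }


# Hypothetical 30-character record. The euro sign stands in for the
# single non-ASCII symbol; it is 3 bytes in UTF-8 and 2 bytes in UTF-16.
record = "ab\u20acdef" + " " + "c2data".ljust(10) + " " + "c3data".ljust(12)

raw = record.encode("utf-8")     # what is stored on disk: 32 bytes
text = raw.decode("utf-8")       # decoded text: 30 characters again

by_bytes = slice_fields(raw)     # wrong: columns drift after the symbol
fields = slice_fields(text)      # right: positions count characters

print(len(raw), len(text))
print(by_bytes["c2"])
print(fields)
```

Slicing the raw bytes pushes every column after the symbol two positions to the right, while slicing the decoded text keeps the columns where the format says they are. Counting two bytes per character in UTF16 only holds while every character fits in a single 16-bit unit, so working on decoded characters is the safer assumption.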
Kee


On Mar 30, 2011, at 7:10 AM, Kee Nethery wrote:

> 
> On Mar 30, 2011, at 5:58 AM, Trevor DeVore wrote:
> 
>> On Tue, Mar 29, 2011 at 9:04 PM, Kee Nethery <kee at kagi.com> wrote:
>> 
>>> How do people deal with this? Do I need to build a UTF16 version of all the
>>> text parsing routines to safely get each line?
>>> 
>> 
>> Can you iterate over the lines of the UTF8 text and then convert to UTF16
>> when you are done?
> 
> The problem I am running into is that the data is actually a set of items that are delimited by their position on a line.
> 
> 123456789012345678901234567890
> c1data       c2data           c3data
> 
> where (for example) c1data is from character 1 to 6, c2data is from 8 to 17, c3data is from 19 to 30
> 
> The problem is that (in my example) the don't symbol is a single character and they write it into a specific character position. But ... when it gets saved out as UTF8 it takes up 3 bytes. I figured that if I converted it to UTF16 I'd at least have a good chance of counting correctly by knowing each character was 2 bytes, but alas, that is not how LiveCode works (at least for me).
> 
> I know, the people who create positional data formats should be <insert bodily harm description here> because of this and other issues, but it's what I have and they aren't changing it any time soon as far as I can see.
> 
> Thanks for the suggestion,
> Kee
> 
> 










