Unicode in variables

J. Landman Gay jacque at hyperactivesw.com
Mon Aug 19 15:29:57 EDT 2013


On 8/19/13 2:15 PM, Devin Asay wrote:
>
> On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
>
>> I need to read and process a tab-delimited text file that is in
>> UTF8 format containing unicode. The final goal is to get it into an
>> array with the first tabbed item as the keys, preserving all
>> unicode. There are some HTML format tags in it as well.
>>
>> If I read the file as binfile, carriage returns are all lost.
>
> Jacque,
>
> Where are the files coming from? Maybe they're using ASCII 13 as a
> line terminator, or ASCII 13 + 10. Can't you replace whatever the
> native line delimiter is with numToChar(10)?

I forgot about that. The line endings are ASCII 13, and replacing them
with numToChar(10) does keep the line breaks. Thanks.
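
For anyone following along outside LiveCode, the same line-ending fix can
be sketched in Python. This is only an illustration under assumptions: the
function name and the sample data are made up, and it is not LiveCode
script. It works on the raw bytes before any decoding, which is safe for
UTF-8 because bytes 13 and 10 never occur inside a multi-byte character.

```python
def normalize_line_endings(raw: bytes) -> bytes:
    """Convert CRLF (13 + 10) and bare CR (13) line endings to LF (10).

    Hypothetical helper for illustration; operates on undecoded bytes.
    """
    # Replace CRLF first so each pair becomes one LF rather than two.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Made-up sample: two tab-delimited records terminated by ASCII 13.
sample = b"key1\tvalue one\rkey2\tvalue two\r"
lines = normalize_line_endings(sample).split(b"\n")
print(lines)
```

Doing the CRLF replacement before the bare-CR replacement means the same
helper copes with files from either source.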

When I run uniEncode(tData,"UTF8") on it, the high-ASCII characters show
up in the variable watcher as "+" followed by an unprintable box. Can I
assume the real character is in there? Will it work for text chunking,
etc.? When I split it into an array, will the keys be intact?
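
To make the end goal concrete, here is a rough Python sketch of the
pipeline I'm after: decode the UTF-8 bytes, normalize the line endings,
then key each record on its first tabbed item. The function name and the
sample records are assumptions for illustration only. It is not LiveCode
and says nothing about how the variable watcher displays the data; it
only shows that non-ASCII keys survive once the bytes are decoded as
UTF-8 before chunking.

```python
def build_table(raw: bytes) -> dict:
    """Map the first tab-delimited item of each line to the remaining items.

    Hypothetical helper for illustration; assumes the input is valid UTF-8.
    """
    text = raw.decode("utf-8")
    # Normalize CRLF and bare CR to LF before splitting into lines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    table = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip empty lines, e.g. after a trailing terminator
        key, *rest = line.split("\t")
        table[key] = rest
    return table


# Made-up sample: CR-terminated records with non-ASCII text and HTML tags.
raw = "café\t<b>bold</b>\tnaïve\rΩmega\t<i>x</i>\r".encode("utf-8")
table = build_table(raw)
print(sorted(table))
```

If two lines share the same first item, the later one overwrites the
earlier one in this sketch.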

-- 
Jacqueline Landman Gay         |     jacque at hyperactivesw.com
HyperActive Software           |     http://www.hyperactivesw.com



