Unicode in variables
J. Landman Gay
jacque at hyperactivesw.com
Mon Aug 19 15:29:57 EDT 2013
On 8/19/13 2:15 PM, Devin Asay wrote:
>
> On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
>
>> I need to read and process a tab-delimited text file that is in
>> UTF8 format containing unicode. The final goal is to get it into an
>> array with the first tabbed item as the keys, preserving all
>> unicode. There are some HTML format tags in it as well.
>>
>> If I read the file as binfile, carriage returns are all lost.
>
> Jacque,
>
> Where are the files coming from? Maybe they're using ASCII 13 as a
> line terminator, or ASCII 13 + 10. Can't you replace whatever the
> native line delimiter is with numToChar(10)?

I forgot about that. They're ASCII 13, and replacing them does keep the
line breaks. Thanks.

When I run uniEncode(tData,"UTF8") on it, the high-ASCII characters show
up in the variable watcher as "+" and an unprintable box. Can I assume the
real character is in there? Will it work for text chunking, etc.? When I
split it into an array, will the keys be intact?
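
For comparison outside LiveCode, here is a minimal Python sketch of the same
pipeline: read the raw bytes, normalize ASCII 13 line endings to ASCII 10,
decode the UTF-8, and key each line on its first tabbed item. The function
name parse_tab_delimited and the sample data are hypothetical, made up for
illustration. It says nothing about how uniEncode or LiveCode arrays behave.

```python
def parse_tab_delimited(raw: bytes) -> dict:
    """Decode UTF-8 bytes and key each line on its first tab-separated item."""
    # Normalize CRLF and bare CR (ASCII 13) to LF (ASCII 10) before decoding.
    # This is safe on raw UTF-8 because bytes 13 and 10 never occur inside
    # a multibyte sequence.
    raw = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    text = raw.decode("utf-8")

    table = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip empty lines, e.g. after a trailing terminator
        key, *rest = line.split("\t")
        table[key] = rest  # remaining tabbed items, HTML tags left untouched
    return table


# Hypothetical sample: two CR-terminated lines with non-ASCII keys.
sample = "café\t<b>coffee</b>\rnaïve\t<i>simple</i>\r".encode("utf-8")
table = parse_tab_delimited(sample)
```

The line endings are normalized on the bytes, before any decoding, so the
order of those two steps does not affect the non-ASCII characters.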
--
Jacqueline Landman Gay | jacque at hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com