Elegant way to express constant UTF8 string in script?

Ben Rubinstein benr_mc at cogapp.com
Mon Jun 30 13:24:24 EDT 2014


Hi Mark,

Thanks for the reply.  The problem is

a) I want to do this purely in script

b) A character entered directly into the script on a Mac comes out differently 
on Windows (i.e. the scripts don't know what character set they're in; they're 
simply stored with no indication of character set, and on every platform 
they're interpreted in the supposedly 'native' character set for that platform).

Presumably in 7.0 I won't even need to use normaliseText, because the scripts 
will themselves be stored in Unicode or UTF8, and therefore I can use any 
Unicode character in a real script constant.  But not in 6.x.
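
For 6.x, one script-only sketch is to build the UTF8 bytes from their numeric 
values, so that the script itself stays pure ASCII and reads the same on every 
platform.  This assumes numToChar() returns a single byte for values up to 255 
when useUnicode is false (the 6.x behaviour; 7.0 may differ), and the handler 
name utf8IceCreamNotice is made up for illustration:

```livecode
-- Sketch for LiveCode 6.x: returns the notice as raw UTF8 bytes.
-- The handler name is hypothetical; only ASCII appears in the script.
function utf8IceCreamNotice
   local tCopyright, tApostrophe
   -- U+00A9 COPYRIGHT SIGN is C2 A9 in UTF8
   put numToChar(194) & numToChar(169) into tCopyright
   -- U+2019 RIGHT SINGLE QUOTATION MARK is E2 80 99 in UTF8
   put numToChar(226) & numToChar(128) & numToChar(153) into tApostrophe
   return "This ice cream is " & tCopyright & " Ben and Jerry" & tApostrophe & "s Inc"
end utf8IceCreamNotice
```

This is a function rather than a constant, because a constant can't hold an 
expression.  The result is already UTF8, so it should be written out as 
binary, e.g. put utf8IceCreamNotice() into URL ("binfile:" & tPath), where 
tPath is a placeholder for the output file path.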

Ben

On 30/06/2014 16:09, Mark Schonewille wrote:
> Hi Ben,
>
> The apostrophe doesn't work because you convert to ASCII text that looks different on different platforms. If you don't use unidecode and just set the unicodeText of a field to your Unicode string, it should work. If that's not practical, you could use macToIso() to convert your string to Latin-1.
>
> --
> Kind regards,
>
> Mark Schonewille
> Economy-x-Talk
> Http://economy-x-talk.com
>
> Share the clipboard of your computer over a local network with Clipboard Link http://clipboardlink.economy-x-talk.com
>
>
> On 30 Jun 2014, at 16:38, Ben Rubinstein <benr_mc at cogapp.com> wrote:
>
>> I think this problem should be solved in LC 7 (possibly using normaliseText); but I need a solution that I can ship now (and it's been threatened that LC 7 will 'fix' a 'bug' which isn't, so I'm not sure if I'll ever be able to use it).
>>
>> My app processes some data from - and then, re-organised, to - UTF8 text files. Occasionally it needs to insert a constant string; and for various reasons (all of them excellent) I want to specify these constant strings in the script.  So far, so good.  Now however one of these constant strings needs to contain a character which is not in ASCII.  Actually two of them.  So I need to express a UTF8 string in my script.  And I'm searching for an elegant way to do this.
>>
>> My constant string used to look something like this:
>>
>>    constant kMyConstantString = "This is my ice cream"
>>
>> but now it needs to read something like
>>    constant kMyConstantString = "This ice cream is (c) Ben and Jerry's Inc"
>>
>> (only with a smart apostrophe and a proper copyright symbol).
>>
>> I thought I could just about manage with this
>>
>>   put uniDecode(uniEncode("This ice cream is © Ben and Jerry’s Inc", "ANSI"), "UTF8") into kMyConstantString
>>
>> (that is, encode from ANSI to Unicode, then from Unicode into UTF8).
>>
>> I tested it on Mac and it seemed to work.  The UTF8 file was generated and this text came out just right.
>>
>>
>> However, it turned out that when the code was compiled and run on Windows, the copyright symbol came out OK, but the apostrophe came out as o-tilde.
>>
>> This is because uniEncode(..., "ANSI") is a lie; "ANSI" is meaningless; instead it interprets the source encoding as whatever is typical for the operating system.  I wrote the script on Mac; in MacRoman, © is 0xA9 and smart apostrophe is 0xD5; in ISO-8859-1 (and in Unicode code points), 0xA9 is ©, but 0xD5 is o-tilde.
>>
>> So... what's the most elegant way to do this (is there one)?  Is there any alternative to just looking up the UTF8 encodings and writing:
>>
>>   put format("This ice cream is \xC2\xA9 Ben and Jerry\xE2\x80\x99s Inc") into kMyConstantString
>>
>> ?
>>
>> TIA,
>>
>> Ben
>>
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode at lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode