getting a section cross-platform without utf

Fraser Gordon fraser.gordon at livecode.com
Sat Sep 27 05:45:38 EDT 2014


On 26/09/2014 21:32, Richmond wrote:
> On 26/09/14 23:20, Dr. Hawkins wrote:
>> On Fri, Sep 26, 2014 at 11:19 AM, Richmond <richmondmathewson at gmail.com>
>> wrote:
>>
>>> Put numToChar(167)
>>>
>> I just tried that on a mac.  I think what it gave me was a german esset,
>> the double s that looks like a beta . . .
>>
>>
>
> That makes no sense at all as the Unicode char 'siglum' § is U+00A7
> Decimal 167
>
> While the 'esset' ß is U+00DF Decimal 223

That's the problem with numToChar: it doesn't produce Unicode
characters; it produces a byte with the given value. So instead of
getting U+00A7 SECTION SIGN you get the byte 0xA7, which is a sharp S
(ß) in the MacRoman encoding but happens to be the section sign in
Windows codepage 1252. (Windows isn't always right either: bytes
0x80-0x9F in codepage 1252 don't match the Unicode characters with the
same values.) To get a section sign on Mac, you'll need to use
numToChar(0xA4).
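
To make the byte-versus-character point concrete, here is a small
sketch in Python rather than LiveCode. It assumes Python's "mac_roman",
"cp1252" and "latin-1" codecs match the native encodings described
here; native_char and NATIVE_ENCODING are hypothetical names used only
for illustration and are not part of LiveCode.

```python
# Hypothetical model of the legacy numToChar: a single byte interpreted in
# the platform's native 8-bit encoding (codec names are Python's, assumed to
# correspond to the encodings discussed in this thread).
NATIVE_ENCODING = {"mac": "mac_roman", "windows": "cp1252", "linux": "latin-1"}


def native_char(value, platform):
    """Return the character that byte `value` represents on `platform`."""
    return bytes([value]).decode(NATIVE_ENCODING[platform])


# The same byte value gives different characters depending on the platform.
for platform in NATIVE_ENCODING:
    ch = native_char(0xA7, platform)
    print(f"{platform:8} byte 0xA7 -> U+{ord(ch):04X} {ch}")

# On Mac, the section sign lives at byte 0xA4 instead.
print(native_char(0xA4, "mac"))
```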

It seems to work properly on Linux because the ISO-8859-1 text encoding
we use there maps every byte to the Unicode character with the same
value, i.e. the first 256 Unicode characters.
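
The ISO-8859-1 property can be checked mechanically. The sketch below,
again in Python and assuming its "latin-1" and "cp1252" codecs stand in
for the encodings named here, lists the byte values that do not decode
to the code point of the same value. mismatched_bytes is a hypothetical
helper name.

```python
def mismatched_bytes(encoding):
    """Byte values that do not decode to the code point of the same value."""
    bad = []
    for value in range(256):
        try:
            ch = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            bad.append(value)  # byte is unassigned in this codec
            continue
        if ord(ch) != value:
            bad.append(value)
    return bad


latin1_bad = mismatched_bytes("latin-1")
cp1252_bad = mismatched_bytes("cp1252")

print(len(latin1_bad))  # number of ISO-8859-1 bytes that differ from Unicode
print(f"0x{min(cp1252_bad):02X}-0x{max(cp1252_bad):02X}")  # cp1252 range
```

Unassigned bytes are counted as mismatches, since they cannot yield
the Unicode character of the same value either.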

And that's why we're deprecating numToChar: it produces text that
differs depending on the platform you run it on. Its Unicode
replacement, numToCodepoint, will produce the same character regardless
of platform. (Deprecating doesn't mean it is going away. It is still
there, doing the same as it always did. We're just discouraging its use
in any new code.)

In case you're wondering, mixing the two is fine - you don't need to
convert all of your code at once. LiveCode correctly converts the output
of numToChar from a native-encoding byte to the corresponding Unicode
character (e.g. doing codepointToNum(numToChar(0xA4)) on a Mac will
produce 0x00A7 because the byte was mapped to the section sign).
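
The round trip can be modelled outside LiveCode as well. This Python
sketch assumes the "mac_roman" codec matches the Mac native encoding;
codepoint_of_native_byte is a hypothetical name standing in for
codepointToNum(numToChar(...)), and chr() plays the role of
numToCodepoint.

```python
def codepoint_of_native_byte(value, encoding):
    """Decode one native-encoding byte and return its Unicode code point."""
    return ord(bytes([value]).decode(encoding))


# The Mac byte 0xA4 maps to the section sign, U+00A7.
print(hex(codepoint_of_native_byte(0xA4, "mac_roman")))  # 0xa7

# Working from the code point needs no encoding and is platform-independent.
print(chr(0xA7))
```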

Regards,
Fraser




