Character Encodings and Livecode fields

Richmond richmondmathewson at gmail.com
Sun Jan 26 11:23:27 EST 2014


On 26/01/14 17:09, Graham Samuel wrote:
> The recent discussions under 'Character Encodings' and other related subjects brought me back to some questions:
>
> 1. Is there an actual property of an LC field that can be examined programmatically to show whether the field works as a Unicode string or not? (I know there's 'unicodeText', which is a property of a text string, but not actually of a field AFAIKS).
>
> 2. How does the mechanism (which I apparently unearthed using Richmond's little Unicode-querying utility) work whereby a non-Unicode character string appended to a Unicode string in an LC field itself becomes Unicode? Is everything made into two-byte characters? I assume this is the case, but I want to be sure. Experiments with Richmond's utility are confusing - the whole string appears to be Unicode if one explicitly Unicode character is present; but if that character is deleted while the others remain, it seems that the string stops being Unicode - this is scarcely credible, but it's what I seem to be seeing.
>
> 3. If I paste one of the Mac-only 'special' non-straight-ascii characters (Mac-Roman - like the square root character) into a Unicode string, will it end up as the Unicode version of the same symbol? I think not, so some kind of pre-filtering would be needed using platform knowledge before allowing the characters in the field to be parsed as Unicode.
>
> My objective as before is to allow a user to type or paste text into a field from any reasonable source, such as a text processor (any platform), a web page or maybe a TeX document, and for non-ASCII characters like pi, square root etc. to be included, with the whole field always ending up as Unicode (you can see I'm interested in mathematical stuff, but this could also work for other languages and special characters). The recent discussions make me doubt the feasibility of this. Does anyone know exactly what the mother ship is planning in respect of 'promoting' non-Unicode character strings to Unicode?

I don't know what you mean by "'promoting' non-Unicode character strings
to Unicode"; that sounds a bit odd: as far as I know the ASCII
set is subsumed as the first 128 code points of Unicode, so that is
neither here nor there.
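
For what it's worth, here is a rough sketch of what "promotion" would
have to involve for anything outside plain ASCII. It is written in
Python rather than LiveCode, the function name promote_to_utf16 is made
up for illustration, and treating "Unicode" as big-endian UTF-16 is an
assumption, not a statement about how the engine stores text.

```python
# Sketch only: a Python stand-in, not LiveCode. The function name and the
# choice of UTF-16 big-endian as the "Unicode" form are assumptions.

def promote_to_utf16(raw: bytes, source_encoding: str) -> bytes:
    """Decode single-byte text with a known encoding, re-encode as UTF-16."""
    return raw.decode(source_encoding).encode("utf-16-be")

# Pure ASCII (bytes 0-127) has the same code points in Unicode, so each
# byte simply gains a leading zero byte when widened to two bytes.
print(promote_to_utf16(b"pi", "ascii"))

# Byte 0xC3 is the square root sign in Mac Roman but A-tilde in Latin-1,
# so the source encoding must be known before the bytes are promoted.
print(hex(ord(b"\xc3".decode("mac_roman"))))
print(hex(ord(b"\xc3".decode("latin-1"))))
```

The point of the sketch is that ASCII text widens trivially, while the
Mac-only "special" characters only land on the right Unicode symbol if
they are decoded as Mac Roman first.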

Richmond.

>
> TIA
>
> Graham