LiveCode's handling of Unicode glyphs being dependent on the underlying OS

Mark Waddingham mark at livecode.com
Thu Mar 30 03:50:54 EDT 2017


On 2017-03-29 22:26, Sannyasin Brahmanathaswami via use-livecode wrote:
> One anomaly that appears to be generated by LC 9dp5 running on Sierra
> 10.12.3: Code point U803 maps in the Unicode standard to the Extended
> Latin "H with dot underneath" character.

Just to check, I take it you mean 'h' followed by U+0323 (decimal 803)? 
The latter is COMBINING DOT BELOW ('combining underdot'), so it needs a 
preceding char to make sense.
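
For reference, the "803" in this thread reads as a decimal value: 803 is 
0x323, i.e. U+0323. A minimal Python sketch (Python is used here purely 
for illustration, it is not part of the original test) that shows the 
code point's name and how it combines with a preceding 'h':

```python
import unicodedata

combining = chr(803)                      # decimal 803 == 0x323
name = unicodedata.name(combining)        # official Unicode character name

sequence = "h" + combining                # base letter + combining mark (2 code points)
composed = unicodedata.normalize("NFC", sequence)  # precomposed form (1 code point)

print(hex(803), name)
print(len(sequence), len(composed), unicodedata.name(composed))
```

Whether a field ends up holding the two-codepoint sequence or the 
precomposed character may matter for font lookup, since a font can have 
a glyph for one form and not the other.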

> for some bizarre reason, on my machine/system,  Livecode is mapping
> this character to Lucida (I think… possibly Helvetica.)
> So this is an issue with the LC engine…

Indeed, that is odd - and does put Richmond's initial issue slightly 
more under the spotlight particularly as I'm now looking at the issue on 
a 10.6 machine (still haven't had time to upgrade it...)

What I observe (in 8.1 - it happened to be the LiveCode version I had 
opened) is this:

Field's textFont set to Devawriter.

Field containing: 'h' U+0323 (decimal 803, the combining underdot)

Displays h-with-underdot glyph - not using Devawriter font.

Field containing: 'h' U+0323 ' '

Displays h-with-underdot glyph - uses Devawriter font.

Revisiting the original problem with the 1CF5, 1CF6 and 1CF7 codepoints:

1CF5 on its own - square glyph
1CF5 ' ' - square glyph, then space
1CF5 'a' - VEDIC SIGN JIHVAMULIYA, square glyph

Similar story for 1CF6.

The 1CF7 codepoint always displays the 'undefined codepoint' glyph from 
the last resort font.

In TextEdit, as long as Devawriter is set as the explicit font, 1CF5 
and 1CF6 seem happy enough to display regardless of the chars before or 
after. 1CF7 does the same thing as it does in LiveCode.

*However*, trying h,underdot in TextEdit I observe a worse behavior than 
in LiveCode - the h,underdot never displays in Devawriter font 
regardless of subsequent chars!

So:

   1) The behavior of 1CF7 seems to be because it is an 'unassigned' 
codepoint at this time - I'm not quite sure of the exact rationale 
behind not just using the specified font's cmap table to generate a 
glyph regardless, but I suspect it might be to do with unassigned 
codepoints not yet having any properties which would affect how they 
are processed.

   2) The behavior in LiveCode with regard to 1CF5 and 1CF6 looks a lot 
like the behavior with [h,underdot] - where a trailing character is 
needed to make the original character appear in the appropriate font.
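
As a rough way to check point (1) outside of LiveCode, the sketch below 
asks Python's bundled Unicode database whether each codepoint is 
assigned. The helper name is_assigned is made up for this example. Note 
the answer depends on the Unicode version shipped with the Python build 
(printed at the end of each line), which may well differ from what the 
OS text system knows about:

```python
import unicodedata

def is_assigned(codepoint):
    """Return True if the bundled Unicode database assigns this codepoint.

    General category "Cn" means "not assigned" in that database version.
    """
    return unicodedata.category(chr(codepoint)) != "Cn"

for cp in (0x1CF5, 0x1CF6, 0x1CF7):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.category(ch),
        unicodedata.name(ch, "<no name: unassigned>"),
        "assigned" if is_assigned(cp) else "unassigned",
        f"(Unicode {unicodedata.unidata_version})",
    )
```

A codepoint reported as unassigned has no script, combining class or 
other properties yet, which fits the suspicion above about why it is 
replaced early in rendering.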

I'm pretty sure that (1) is an OS issue and not something we could 
necessarily do much about - it seems to be replacing it with the 
undefined codepoint early on in the rendering process (at the OS level, 
not the engine).

However, (2) does look like it could be an engine issue - indeed, it 
feels like an 'off-by-one' error somewhere in the processing of the runs 
of characters which eventually get passed to CoreText.
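
To illustrate the kind of off-by-one meant here, a toy sketch follows. 
It is purely hypothetical and is not LiveCode engine code: the function 
names, the 'covered' set standing in for font coverage, and the run 
format are all assumptions made for the example. The second function 
stops one index early, so the final run is only emitted if a later 
character with different coverage happens to close it:

```python
def split_runs(text, covered):
    """Split text into (substring, font_has_glyphs) runs."""
    runs = []
    start = 0
    for i in range(1, len(text) + 1):
        at_end = i == len(text)
        # Close the current run at the end of the text or when coverage flips.
        if at_end or (text[i] in covered) != (text[start] in covered):
            runs.append((text[start:i], text[start] in covered))
            start = i
    return runs

def split_runs_off_by_one(text, covered):
    """Deliberately buggy variant: the loop ends one step too soon,
    so the last run is never flushed."""
    runs = []
    start = 0
    for i in range(1, len(text)):  # bug: should be len(text) + 1
        if (text[i] in covered) != (text[start] in covered):
            runs.append((text[start:i], text[start] in covered))
            start = i
    return runs

font = {"h", "\u0323"}  # pretend the font covers only these two codepoints

print(split_runs("h\u0323", font))
print(split_runs_off_by_one("h\u0323", font))
print(split_runs_off_by_one("h\u0323 ", font))
```

In this toy model the buggy version loses the 'h' + underdot run unless 
something follows it, which is at least the same shape as the symptom 
seen in the field.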

That being said, there is different behavior in general between TextEdit 
and LiveCode - which reminds me that I *think* Apple had not yet 
replumbed the text support in Cocoa to use CoreText in 10.6 - that 
happened in a later OS. So, in 10.6 TextEdit is most likely using 
entirely separate Text APIs from LiveCode...

Anyway, I'll take a closer look and see if I can find where the problem 
might be.

Warmest Regards,

Mark.

P.S. In terms of 1CF7 - it still looks like you might have to use a PUA 
char to have it work on Mac until it becomes widely supported :(

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



