Lying in the bath, but telling the truth.

Mark Waddingham mark at livecode.com
Wed Jun 15 12:43:53 EDT 2016


Hi Richmond,

On 2016-06-15 18:27, Richmond wrote:

> So, obviously, I will have to set a "bot" to trawl its way through my 
> code
> and replace every incidence of *numToChar* to *numToCodePoint*, and
> replace the surrogate pairs in the upcoming *Grantha* interface
> with "standard" Unicode addresses. The first of which should (?) be 
> relatively
> simple if the global search-N-replace behaves itself, the second will 
> be a
> bother, but nothing insurmountable.

If all your instances of numToChar are where useUnicode is 'true' then 
you probably *won't* have to do this.

When useUnicode is true, numToChar() works as it always did - it 
produces two bytes which are the binary encoding of the specified 
unicode code unit (not codepoint - more on that in a moment) as UTF-16.

Now, numToChar() (with useUnicode true) never supported unicode 
codepoints above 65535 - however, I think you have already figured out 
how to decompose a character outside of the BMP (i.e. > 65535) into a 
surrogate pair: two code units which are each <= 65535 and thus 
supported by numToChar().
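
For reference, the decomposition can be sketched as below. This is 
Python rather than LiveCode script, purely to show the arithmetic; the 
function name to_surrogate_pair is made up for the illustration, and 
the constants are the standard UTF-16 ones.

```python
def to_surrogate_pair(codepoint):
    """Split a codepoint above 0xFFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint is not outside the BMP")
    offset = codepoint - 0x10000        # 20 significant bits remain
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


# U+11305 sits in the Grantha block; each half fits in 16 bits.
high, low = to_surrogate_pair(0x11305)
print(hex(high), hex(low))
```

Each of the two numbers returned is <= 65535, so each one is the kind 
of value that can be passed to numToChar() on its own.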

You mention that Devawriter Pro was written against 4.5.x - if I recall 
correctly then this was *before* the field became more intelligent at 
handling unicode. Around 5.5 we changed the field so that it 
*understood* that a unicode code unit (any unicode char <= 65535; a 
surrogate pair is two code units) was a single 'char'. Prior to 5.5, 
the field used 'char' to mean byte. So char 1 of field 1, where the 
first character in the field was a unicode character, would return you 
the first byte of the code unit, not the code unit itself - which you 
would get with char 1 to 2 of field 1.
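
To make the difference concrete, here is a rough model of the two 
meanings of 'char' over the same UTF-16 data. It is Python, not 
LiveCode script; the helper names are invented for the illustration 
and the little-endian byte order is just an assumption made for the 
example.

```python
text = "\u0905\u0915"             # two Devanagari letters, both in the BMP
data = text.encode("utf-16-le")   # byte order assumed for the example only


def char_as_byte(data, n):
    """Pre-5.5 view: 'char n' is the n-th byte (1-based)."""
    return data[n - 1:n]


def char_as_code_unit(data, n):
    """5.5+ view: 'char n' is the n-th 16-bit code unit (1-based)."""
    return data[2 * (n - 1):2 * n]


# Under the old view you need chars 1 to 2 to get one whole code unit.
old_first = char_as_byte(data, 1) + char_as_byte(data, 2)
new_first = char_as_code_unit(data, 1)
print(old_first == new_first)
```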

This latter fact probably means you will need to spend some time looking 
at the code which manipulates fields: if you are using 'char' on your 
fields containing unicode and computing indices thereof (e.g. char 3 to 
4 of field 1), you'll need to adjust for that.
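
The adjustment itself is simple arithmetic, roughly as sketched below. 
Again this is Python rather than LiveCode script, the function name is 
hypothetical, and it assumes the old byte-based range always covers 
whole code units (i.e. it starts on an odd byte and ends on an even 
one).

```python
def byte_range_to_unit_range(first_byte, last_byte):
    """Convert a 1-based, inclusive byte range (pre-5.5 'char' indices)
    into the matching code-unit range (5.5+ 'char' indices)."""
    if first_byte % 2 != 1 or last_byte % 2 != 0 or last_byte < first_byte:
        raise ValueError("range does not cover whole 16-bit code units")
    return (first_byte + 1) // 2, last_byte // 2


# The old 'char 3 to 4' is the second code unit, i.e. the new 'char 2'.
print(byte_range_to_unit_range(3, 4))
```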

So, to sum up, the changes introduced around 5.5 are likely to cause you 
*more* trouble than those introduced with 7.0. If you fix your code so 
it works with the 5.5 behaviour of the field, and make sure you put text 
into the field using 'set the unicodeText of <field chunk>' or 'put 
unicode ... into <field chunk>', then you *should* find that there is 
little or no need to update your unicode construction code - which is 
where all the instances of numToChar are.

Hope this helps!

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



