Lying in the bath, but telling the truth.

Richmond richmondmathewson at gmail.com
Wed Jun 15 13:37:43 EDT 2016



On 15.06.2016 19:43, Mark Waddingham wrote:
> Hi Richmond,
>
> On 2016-06-15 18:27, Richmond wrote:
>
>> So, obviously, I will have to set a "bot" to trawl its way through my
>> code and replace every instance of *numToChar* with *numToCodePoint*,
>> and replace the surrogate pairs in the upcoming *Grantha* interface
>> with "standard" Unicode addresses. The first of these should (?) be
>> relatively simple if the global search-N-replace behaves itself; the
>> second will be a bother, but nothing insurmountable.
>
> If all your instances of numToChar are where useUnicode is 'true' then 
> you probably *won't* have to do this.
>
> When useUnicode is true, numToChar() works as it always did - it 
> produces two bytes which are the binary encoding of the specified 
> unicode code unit (not codepoint - see in a minute) as UTF-16.
>
> Now, numToChar() (with useUnicode true) never supported unicode 
> codepoints above 65535 - however I think you already figured out how 
> to decompose a character outside of the BMP (i.e. > 65535) into a 
> surrogate pair: two code units which are each <= 65535 and thus 
> supported by numToChar().
>
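
A minimal sketch of that decomposition, written in Python rather than LiveCode so the arithmetic can be checked independently of any engine version. The function names are hypothetical, and U+11305 is used only as a sample code point from the Unicode Grantha block (U+11300 to U+1137F).

```python
def to_surrogate_pair(codepoint):
    """Split a code point above 0xFFFF into (high, low) UTF-16 code units."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = codepoint - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Inverse of to_surrogate_pair: rebuild the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x11305)
print(hex(high), hex(low))                  # 0xd804 0xdf05
print(hex(from_surrogate_pair(high, low)))  # 0x11305
```

Each of the two resulting numbers is at most 65535, which is what makes them acceptable to numToChar() with useUnicode set to true, as described above.
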
> You mention that Devawriter Pro was written against 4.5.x - if I 
> recall correctly then this was *before* the field became more 
> intelligent at handling unicode. Around 5.5 we changed the field so 
> that it *understood* that a unicode code unit (any unicode char <= 
> 65535, surrogate pairs are two code units) was a single 'char'. Prior 
> to 5.5, the field used 'char' to mean byte (so char 1 of field 1, 
> where the first character in the field was a unicode character, would 
> return the first byte of the code unit, not the code unit itself - 
> which you would get with char 1 to 2 of field 1).
>
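
To make the two meanings of 'char' concrete, here is a small Python model (not LiveCode) of the behaviour described above. It assumes, purely for illustration, that the text is held as big-endian UTF-16 bytes; the byte order the engine actually uses is not stated in this thread. The helper names are hypothetical.

```python
# Two Devanagari code units encoded as UTF-16 (big-endian is an assumption).
RAW = "\u0915\u093e".encode("utf-16-be")  # bytes: 09 15 09 3E


def char_as_byte(raw, start, end=None):
    """Model of pre-5.5 fields: 'char N' is one byte (1-based, inclusive)."""
    end = start if end is None else end
    return raw[start - 1:end]


def char_as_code_unit(raw, start, end=None):
    """Model of 5.5+ fields: 'char N' is one 16-bit code unit (1-based, inclusive)."""
    end = start if end is None else end
    return raw[2 * (start - 1):2 * end]


print(char_as_byte(RAW, 1).hex())       # 09    (half a code unit)
print(char_as_byte(RAW, 1, 2).hex())    # 0915  (the whole first code unit)
print(char_as_code_unit(RAW, 1).hex())  # 0915  (same result from 'char 1')
```
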
> This latter fact probably means you will need to spend some time 
> looking at the code which manipulates fields: if you are using 
> 'char' on fields containing unicode and computing indices into 
> them (e.g. char 3 to 4 of field 1), you'll need to adjust for that.
>
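
One way to picture the index adjustment, again as a Python sketch with a hypothetical helper name: a pre-5.5 byte range that covers whole code units maps onto a 5.5+ code-unit range by halving. It assumes every character in the range occupies exactly two bytes.

```python
def byte_range_to_unit_range(first_byte, last_byte):
    """Convert a 1-based inclusive byte range into a 1-based inclusive
    code-unit range, assuming two bytes per code unit."""
    if first_byte % 2 != 1 or last_byte % 2 != 0 or last_byte < first_byte:
        raise ValueError("range does not cover whole two-byte code units")
    return (first_byte + 1) // 2, last_byte // 2


print(byte_range_to_unit_range(3, 4))  # (2, 2)
print(byte_range_to_unit_range(1, 6))  # (1, 3)
```

Under that assumption, an old 'char 3 to 4 of field 1' would become 'char 2 of field 1'.
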
> So, to sum up, the changes introduced around 5.5 are likely to cause 
> you *more* trouble than those introduced with 7.0. If you fix your 
> code so it works with the 5.5 behaviour of the field, and make sure 
> you put text into the field using 'set the unicodeText of <field 
> chunk>' or 'put unicode ... into <field chunk>', then you *should* 
> find that there is little or no need to update your unicode 
> construction code - which has all the instances of numToChar.

This is rather interesting as all my code currently features

set the unicodeText of fld "XYZ" to the unicodeText of fld "XYZ" & numToChar(12345)

in LC/RR 4.5, to which I should add:

1. That works 100% in LC 4.5

2. I thought that was "the way" in 4.5, so I don't entirely understand 
why "the changes introduced around 5.5 are likely to cause you *more* 
trouble than those introduced with 7.0".

Having said that, we'll see soon enough whether I come a cropper or not :)

Richmond.

>
> Hope this helps!

Very much so.
>
> Warmest Regards,
>
> Mark.
>




