Lying in the bath, but telling the truth.
Richmond
richmondmathewson at gmail.com
Wed Jun 15 13:37:43 EDT 2016
On 15.06.2016 19:43, Mark Waddingham wrote:
> Hi Richmond,
>
> On 2016-06-15 18:27, Richmond wrote:
>
>> So, obviously, I will have to set a "bot" to trawl its way through my
>> code and replace every instance of *numToChar* with *numToCodePoint*,
>> and replace the surrogate pairs in the upcoming *Grantha* interface
>> with "standard" Unicode addresses. The first of which should (?) be
>> relatively simple if the global search-N-replace behaves itself; the
>> second will be a bother, but nothing insurmountable.
>
> If all your instances of numToChar are where useUnicode is 'true' then
> you probably *won't* have to do this.
>
> When useUnicode is true, numToChar() works as it always did - it
> produces two bytes which are the binary encoding of the specified
> unicode code unit (not codepoint - see in a minute) as UTF-16.
>
> Now, numToChar() (with useUnicode true) never supported unicode
> codepoints above 65535 - however I think you already figured out how
> to decompose a character outside of the BMP (i.e. > 65535) into a
> surrogate pair: two code units which are each < 65535 and thus
> supported by numToChar().
>
> You mention that Devawriter Pro was written against 4.5.x - if I
> recall correctly then this was *before* the field became more
> intelligent at handling unicode. Around 5.5 we changed the field so
> that it *understood* that a unicode code unit (any unicode char <=
> 65535, surrogate pairs are two code units) was a single 'char'. Prior
> to 5.5, the field used 'char' to mean byte (so char 1 of field 1,
> where the first character in a field was a unicode character, would
> return you the first byte of the code unit, not the code unit itself -
> which you would get with char 1 to 2 of field 1).
>
> This latter fact probably means you will need to spend some time
> looking at the code which manipulates fields as, if you are using
> 'char' on your fields containing unicode and computing indices
> thereof (e.g. char 3 to 4 of field 1), you'll need to adjust for that.
>
> So, to sum up, the changes introduced around 5.5 are likely to cause
> you *more* trouble than those introduced with 7.0 - if you fix your
> code so it works with 5.5 functioning of the field and make sure you
> put text into the field using 'set the unicodeText of <field chunk>'
> or 'put unicode ... into <field chunk>'; then you *should* find that
> there is little or no need to update your unicode construction code -
> which has all the instances of numToChar.
This is rather interesting, as all my code currently features
  set the unicodeText of fld "XYZ" to the unicodeText of fld "XYZ" & numToChar(12345)
in LC/RR 4.5, to which I should add:
1. That works 100% in LC 4.5
2. I thought that was "the way" in 4.5, so I don't entirely understand
"the changes introduced around 5.5 are likely to cause you *more*
trouble than those introduced with 7.0".
Having said that, we'll see soon enough if I come a cropper or not :)
Richmond.
>
> Hope this helps!
Very much so.
>
> Warmest Regards,
>
> Mark.
>
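For readers following the thread, the surrogate-pair decomposition Mark refers to can be sketched as plain arithmetic. This is a hedged illustration in Python, not LiveCode, and not code from either poster: the function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical, and U+11305 is used only as a sample code point from the Grantha block (U+11300 to U+1137F). Each of the two resulting values is below 65536, which is why each could be passed separately to a 16-bit numToChar() call.

```python
def to_surrogate_pair(codepoint):
    """Split a code point above 0xFFFF into (high, low) UTF-16 code units."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = codepoint - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    # Sample code point from the Grantha block (an assumption for illustration).
    high, low = to_surrogate_pair(0x11305)
    print(high, low)
    print(hex(from_surrogate_pair(high, low)))
```

The reverse function shows what replacing hand-built surrogate pairs with "standard" Unicode addresses amounts to: recovering the single code point that the two code units encode.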