Translating escape sequences

Richmond richmondmathewson at gmail.com
Thu Mar 16 04:30:26 EDT 2017


Should do.

Richmond.
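
For the general case raised in the original question quoted below (several escapes in one string, with no fixed suffix such as "uls" to knock off), here is a minimal sketch rather than a tested library routine. It relies on the JSON specification (RFC 8259), under which a \u escape always carries exactly four hex digits and a character above U+FFFF, such as U+111F4, is written as a surrogate pair (\ud804\uddf4). The handler name decodeUnicodeEscapes and its variables are made up for illustration. It assumes every escape is well formed and that the text contains no escaped backslashes ("\\u").

```livecode
-- Hypothetical helper, not from the thread: decodes JSON-style \uXXXX escapes.
-- Assumes well-formed escapes (exactly four hex digits after "\u") and
-- no escaped backslashes such as "\\u" in pText.
function decodeUnicodeEscapes pText
   local tResult, tPos, tCode, tLow
   set the caseSensitive to true -- match "\u" only, not "\U"
   repeat forever
      put offset("\u", pText) into tPos
      if tPos is 0 then exit repeat
      -- copy the plain text in front of the escape
      put char 1 to (tPos - 1) of pText after tResult
      -- convert the four hex digits after "\u" to a decimal number
      put baseConvert(toUpper(char (tPos + 2) to (tPos + 5) of pText), 16, 10) into tCode
      delete char 1 to (tPos + 5) of pText
      -- a high surrogate (D800-DBFF) followed by a low surrogate (DC00-DFFF)
      -- together encode one codepoint above U+FFFF
      if tCode >= 55296 and tCode <= 56319 and char 1 to 2 of pText is "\u" then
         put baseConvert(toUpper(char 3 to 6 of pText), 16, 10) into tLow
         if tLow >= 56320 and tLow <= 57343 then
            put 65536 + (tCode - 55296) * 1024 + (tLow - 56320) into tCode
            delete char 1 to 6 of pText
         end if
      end if
      put numToCodepoint(tCode) after tResult
   end repeat
   -- whatever is left contains no further escapes
   return tResult & pText
end decodeUnicodeEscapes
```

Called as decodeUnicodeEscapes("Eduardo Ba\u00f1uls"), this should give back the name with the ñ in place. Taking exactly four digits per escape, instead of trimming a known suffix, is what lets it cope with any number of escapes in the same string.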

On 15/03/17 23:03, Mike Bonner via use-livecode wrote:
> Does this mean one could replace "\u" with "0x" and then replace "uls" with
> empty, and end up with the correct result?
>
> On Wed, Mar 15, 2017 at 2:16 PM, Richmond Mathewson via use-livecode <
> use-livecode at lists.runrev.com> wrote:
>
>> Just knock off the last 3 characters, and what is left is what you want.
>>
>> Richmond.
>>
>> On 3/15/17 6:43 pm, J. Landman Gay via use-livecode wrote:
>>
>>> The problem with the pseudo code is that there's no clear indication of
>>> how many characters at the end to preserve. I'm not sure how the libraries
>>> deal with that.
>>>
>>> --
>>> Jacqueline Landman Gay         |     jacque at hyperactivesw.com
>>> HyperActive Software           |     http://www.hyperactivesw.com
>>>
>>>
>>>
>>> On March 15, 2017 2:28:57 AM Richmond Mathewson via use-livecode <
>>> use-livecode at lists.runrev.com> wrote:
>>>
>>>> No; it won't always be 4 characters. Here's an admittedly extremely
>>>> obscure ancient Sinhala number:
>>>> 0x111F4.
>>>>
>>>> Of course the chances of encountering wacky characters like that are
>>>> small, but you'll have to make sure you
>>>> can cope with them should they crop up.
>>>>
>>>> If you look at Eduardo Ba\u00f1uls, you will have to take what comes
>>>> after the '\', strip the prefix 'u'
>>>> and the suffix 'uls', and then you can cope with whatever is left:
>>>>
>>>> Rough pseudo-code follows (theText is a stand-in for whatever holds the name):
>>>>
>>>> set the itemDelimiter to "\"
>>>> put item 2 of theText into HOLDER   -- "u00f1uls"
>>>> delete char 1 of HOLDER             -- drop the "u" prefix
>>>> delete the last char of HOLDER      -- drop the "uls" suffix,
>>>> delete the last char of HOLDER      -- one char at a time
>>>> delete the last char of HOLDER
>>>> put "0x" & HOLDER into NUNUM        -- "0x00f1"
>>>>
>>>> At this point "NUNUM" could be almost any length, but that should not
>>>> matter unduly.
>>>>
>>>> Richmond.
>>>>
>>>> On 3/14/17 11:26 pm, J. Landman Gay via use-livecode wrote:
>>>>
>>>>> I'm dealing with non-English languages, and JSON data retrieved from a
>>>>> database comes in with unicode escape sequences like this: Eduardo
>>>>> Ba\u00f1uls.
>>>>>
>>>>> I need to translate those. I can do it by replacing the "\u" with "0x"
>>>>> and then using numToCodepoint() to get the UTF16 character. But there
>>>>> could be many of these in the same string, so I'm looking for a
>>>>> one-shot command that might just do them all. I don't think we have one.
>>>>>
>>>>> The alternative is to loop through all the text, getting an offset for
>>>>> each "\u" and then calculating the number of characters after that to
>>>>> use with numToCodepoint(). But will it always be 4 characters in any
>>>>> language?
>>>>>
>>>>> Or is there an easier way?
>>>>>
>>>>>
>>>> _______________________________________________
>>>> use-livecode mailing list
>>>> use-livecode at lists.runrev.com
>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>> subscription preferences:
>>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>>>
>>>
>>>




