Translating escape sequences
Richmond
richmondmathewson at gmail.com
Thu Mar 16 04:29:30 EDT 2017
Ouch. My excuse is that I was working with the example you supplied.
Richmond.
On 15/03/17 22:36, J. Landman Gay via use-livecode wrote:
> What if the user name has seven characters after the escape sequence?
>
> On 3/15/17 3:16 PM, Richmond Mathewson via use-livecode wrote:
>> Just knock off the last 3, and what is left is what you want.
>>
>> Richmond.
>>
>> On 3/15/17 6:43 pm, J. Landman Gay via use-livecode wrote:
>>> The problem with the pseudo code is that there's no clear indication
>>> of how many characters at the end to preserve. I'm not sure how the
>>> libraries deal with that.
>>>
>>> --
>>> Jacqueline Landman Gay | jacque at hyperactivesw.com
>>> HyperActive Software | http://www.hyperactivesw.com
>>>
>>>
>>>
>>> On March 15, 2017 2:28:57 AM Richmond Mathewson via use-livecode
>>> <use-livecode at lists.runrev.com> wrote:
>>>
>>>> No; it won't always be 4 characters. Here's an admittedly extremely
>>>> obscure ancient Sinhala number:
>>>> 0x111F4.
>>>>
>>>> Of course the chance of encountering wacky characters like that is
>>>> small, but you'll have to make sure you
>>>> can cope with them should they crop up.
>>>>
>>>> If you look at Eduardo Ba\u00f1uls, you will have to strip the prefix 'u'
>>>> and the suffix 'uls' from what comes after the '\',
>>>> and then you can cope with whatever is left:
>>>>
>>>> Reasonably pseudo-code following:
>>>>
>>>> -- theName is a hypothetical variable holding "Eduardo Ba\u00f1uls"
>>>> set the itemDelimiter to "\"
>>>> put item 2 of theName into HOLDER -- what's after the "\": "u00f1uls"
>>>> delete char 1 of HOLDER -- the prefix "u"
>>>> delete char -3 to -1 of HOLDER -- the last 3 chars, the suffix "uls"
>>>> put "0x" & HOLDER into NUNUM -- "0x00f1"
>>>>
>>>> At this point NUNUM could be almost any length, but that should not
>>>> matter unduly.
>>>>
>>>> Richmond.
>>>>
>>>> On 3/14/17 11:26 pm, J. Landman Gay via use-livecode wrote:
>>>>> I'm dealing with non-English languages, and JSON data retrieved from a
>>>>> database comes in with unicode escape sequences like this: Eduardo
>>>>> Ba\u00f1uls.
>>>>>
>>>>> I need to translate those. I can do it by replacing the "\u" with "0x"
>>>>> and then using numToCodepoint() to get the UTF16 character. But there
>>>>> could be many of these in the same string, so I'm looking for a
>>>>> one-shot command that might just do them all. I don't think we have
>>>>> one.
>>>>>
>>>>> The alternative is to loop through all the text, getting an offset for
>>>>> each "\u" and then calculating the number of characters after that to
>>>>> use with numToCodepoint(). But will it always be 4 characters in any
>>>>> language?
>>>>>
>>>>> Or is there an easier way?
>>>>>
>>>>
>>>> _______________________________________________
>>>> use-livecode mailing list
>>>> use-livecode at lists.runrev.com
>>>> Please visit this url to subscribe, unsubscribe and manage your
>>>> subscription preferences:
>>>> http://lists.runrev.com/mailman/listinfo/use-livecode
>>>
>>
>
>
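For what it's worth, in JSON a \u escape is always followed by exactly four hex digits (RFC 8259). A code point above U+FFFF, such as the 0x111F4 mentioned in the thread, arrives as a surrogate pair of two escapes (\ud804\uddf4). So a name with seven characters after the escape is not ambiguous: only the first four belong to the escape. Below is a minimal sketch of the "loop through all the text" idea from the original question, written in Python rather than LiveCode so that it can be run as-is. The name decode_u_escapes is made up for illustration, and the sketch assumes the input contains no escaped backslashes (\\u...), which a real JSON parser would handle for you.

```python
import re

# One surrogate pair (two escapes) OR one single escape.
# Each escape is a backslash, "u", and exactly four hex digits.
_ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)


def _replace(match):
    high, low, single = match.groups()
    if single is not None:
        # Ordinary escape: the four hex digits are the code point itself.
        return chr(int(single, 16))
    # Surrogate pair: combine the two 16-bit halves into one code point.
    code = 0x10000 + ((int(high, 16) - 0xD800) << 10) + (int(low, 16) - 0xDC00)
    return chr(code)


def decode_u_escapes(text):
    # Hypothetical helper: translate every \uXXXX escape in one pass.
    return _ESCAPE.sub(_replace, text)


if __name__ == "__main__":
    print(decode_u_escapes(r"Eduardo Ba\u00f1uls"))  # Eduardo Bañuls
```

Because the pattern takes exactly four hex digits, the "uls" after "00f1" is left alone, and nothing has to be knocked off the end by hand.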