Help converting Hex UTF-8 bytes to character

Paul Dupuis paul at researchware.com
Thu May 31 16:39:36 EDT 2018


As a general approach:

1) use offset() looking for "\x" (or you could use regex) to find the start
2) if the value returned by offset is not zero (call it tOS) put char
tOS+2 to tOS+3 into tByte1 and char tOS+6 to tOS+7 into tByte2 to get the
2 hex values
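3) the two values are UTF-8 bytes, not one 16-bit number, so for a 2-byte
sequence use the formula put
(baseConvert(tByte1,16,10) bitAnd 31)*64+(baseConvert(tByte2,16,10) bitAnd 63)
into tCodePoint (C3 B3 gives 3*64+51 = 243, i.e. U+00F3, which is ó)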
4) lastly put numToCodepoint(tCodePoint) into char tOS to tOS+7 of the
original string

Off the top of my head and (obviously) not tested.
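
The steps above could be sketched roughly as the function below. This is
an untested sketch, not a definitive implementation: the name
decodeHexEscapes and its local variables are made up for illustration, and
it assumes every "\x" in the text is followed by exactly two valid hex
digits. Instead of computing the code point by hand, it collects each run
of consecutive \xHH escapes as raw bytes and hands them to textDecode(),
which should also cover 3- and 4-byte UTF-8 sequences.

```
function decodeHexEscapes pText
   local tOS, tSkip, tEnd, tBytes, tDecoded
   put 0 into tSkip
   repeat forever
      -- offset() returns a position relative to the skipped characters
      put offset("\x", pText, tSkip) into tOS
      if tOS is 0 then exit repeat
      add tSkip to tOS

      -- gather the run of consecutive \xHH escapes as raw bytes
      put empty into tBytes
      put tOS into tEnd
      repeat while char tEnd to (tEnd + 1) of pText is "\x"
         put numToByte(baseConvert(char (tEnd + 2) to (tEnd + 3) of pText, 16, 10)) after tBytes
         add 4 to tEnd
      end repeat

      -- decode the bytes as UTF-8 and replace the escapes in place
      put textDecode(tBytes, "UTF-8") into tDecoded
      put tDecoded into char tOS to (tEnd - 1) of pText

      -- continue searching after the text that was just inserted
      put tOS - 1 + the number of chars in tDecoded into tSkip
   end repeat
   return pText
end function
```

With the example from the question, decodeHexEscapes("versi\xC3\xB3n HTML5")
is expected to return "versión HTML5". Searching from tSkip rather than from
the start means already-decoded text is never scanned a second time.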


On 5/31/2018 4:13 PM, Trevor DeVore via use-livecode wrote:
> Hi,
>
> I have a text file that contains Hex UTF-8 bytes encoded in the following
> manner:
>
> ```
> \xC3\xB3
> ```
>
> This particular sequence represents the following character:
>
> ```
> ó
> ```
>
> I need to read this file in, converting these hex bytes to the proper
> character. For example, the following string:
>
> ```
> versi\xC3\xB3n HTML5
> ```
>
> should be read in as:
>
> ```
> versión HTML5
> ```
>
> Does anybody know how to use the C3 B3 hex values to generate the proper
> character?
>
> Thanks,
>




