Decoding "quoted-printable" -- Help needed -- Reopened
R.H.
roland.huettmann at gmail.com
Thu Nov 14 14:41:22 EST 2019
Oh, sorry, I was too quick declaring a solution.
The function itself runs fine and its result also converts back, but
"quoted-printable" with UTF-8 expects each byte of the encoded text to be
written in hex as exactly two ASCII characters.
For example, the Euro symbol "€" is three bytes in UTF-8.
The function below converts to "=E2=201A=AC" while it must be "=E2=82=AC".
The "=" sign is just a delimiter in quoted-printable.
Now, I do not know what is wrong in my thinking as I am not getting quite
the same results.
(The result is ok for other symbols such as 'ü'.)
EXAMPLE:
put "€" into tChar
// First encode to UTF-8:
put textEncode(tChar,"UTF-8") into tCodedChar
// Repeat for each codepoint in the UTF-8 char
repeat for each codePoint tCodePoint in tCodedChar
   // Encode each codepoint to its integer expression and convert to Hex value:
   put "=" & baseConvert(codePointToNum(tCodePoint), 10, 16) after tEncoded
end repeat
put tEncoded into field "Show Codepoints" -- Expected ASCII representing Hex numbers
-- Result: "=E2=201A=AC" -- Instead of "=E2=82=AC" , but valid and working.
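One possible explanation, offered as an assumption rather than a confirmed diagnosis: textEncode returns binary data, and walking it by codePoint may reinterpret each byte as a character in the platform's native encoding before codePointToNum is applied. In Windows-1252, byte 0x82 is the character U+201A, which would account for the "201A". A minimal sketch under that assumption iterates by byte instead and pads each value to two hex digits; the variable names tBytes and tByte are illustrative only, and the "%02X" pattern assumes format follows the usual printf-style conventions.

```livecode
put "€" into tChar
-- textEncode returns binary data; for "€" the UTF-8 bytes are E2 82 AC
put textEncode(tChar, "UTF-8") into tBytes
put empty into tEncoded
repeat for each byte tByte in tBytes
   -- byteToNum gives the raw byte value (0..255), independent of any text encoding
   -- "%02X" is assumed to zero-pad to two uppercase hex digits
   put "=" & format("%02X", byteToNum(tByte)) after tEncoded
end repeat
put tEncoded into field "Show Codepoints"
```

The padding matters separately from the byte iteration: baseConvert does not add leading zeros, so a byte such as 0x0A would come out as a single digit.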
The actual "correct" UTF-8 result can be tested here:
http://www.endmemo.com/unicode/unicodeconverter.php
What am I missing?
Thanks a lot
Roland