Converting hex character references
Alex Tweedly
alex at tweedly.net
Mon Jan 10 16:15:54 EST 2005
Richard Gaskin wrote:
> Converting ISO-8859-1 character references to displayable text is a snap:
>
> If field 1 contains this:
>
> Dont give up & call it quits.
>
> ...I can get the plain text like this:
>
> set the htmlText of fld 2 to the text of fld 1
> get the text of fld 2
>
>
> But what do I do when the data I'm working with contains hex character
> references?:
>
> Don’t give up & call it quits.
>
> I have a bunch of XML files that are UTF-8 encoded and chock full o'
> hex character references like that, and doing a replace on each or
> hunting them down to do a baseConvert would be inefficient.
>
> I'd like to think some combination of Unicode functions/properties
> would do the trick, but alas I'm too braindead to come up with the
> winning solution.
Sorry, I'm clueless about Unicode; nothing leaps out of the docs to
suggest itself.

If there isn't a clever Unicode method, you could do the following.
Note that it ignores the more complex parts of UTF-8 and deals only with
those chars that can be represented in 2 hex digits. It rewrites each
hex reference as the equivalent decimal reference.

It uses replace to do the actual changes, but only does one replace for
each distinct character reference in the original, so it should be
pretty fast (NB: not tested for speed, only for working correctly in
simple cases).

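on mouseUp
   local tText, tArr, tNew, tmp
   put the text of field "inField" into tText
   put tText into tmp
   -- split on "&" and ";" so that each "&name;" reference becomes an array key
   split tmp by "&" and ";"
   put the keys of tmp into tArr
   -- keep only the hex references, e.g. "#x92"
   filter tArr with "#x*"
   repeat for each line L in tArr
      -- only values of up to two hex digits are handled
      if the length of L > 4 then next repeat
      put baseConvert(char 3 to -1 of L, 16, 10) into tNew
      -- rewrite the whole reference as decimal: "&#x92;" becomes "&#146;"
      replace ("&" & L & ";") with ("&#" & tNew & ";") in tText
   end repeat
   put tText after msg
end mouseUp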
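
Since the handler leaves decimal references in the text, its result
could presumably be fed straight into the htmlText trick from the
original question. A minimal sketch of that combination follows, not
tested against real XML. The function name hexRefsToDecimal and the
scratch field "outField" are made up for illustration, and it keeps the
same two-hex-digit limit described above.

```livecode
-- hexRefsToDecimal is a hypothetical helper name, not a built-in
function hexRefsToDecimal pText
   local tParts, tRefs, tDec
   put pText into tParts
   -- each "&name;" reference becomes a key of the array
   split tParts by "&" and ";"
   put the keys of tParts into tRefs
   filter tRefs with "#x*"
   repeat for each line tRef in tRefs
      -- assumption: skip anything longer than two hex digits
      if the length of tRef > 4 then next repeat
      put baseConvert(char 3 to -1 of tRef, 16, 10) into tDec
      replace ("&" & tRef & ";") with ("&#" & tDec & ";") in pText
   end repeat
   return pText
end hexRefsToDecimal

on mouseUp
   -- "outField" is a hypothetical scratch field; setting its htmlText
   -- lets the engine decode the decimal references
   set the htmlText of field "outField" to hexRefsToDecimal(the text of field "inField")
   put the text of field "outField" into msg
end mouseUp
```

For example, hexRefsToDecimal("a&#x41;b") should return "a&#65;b",
since baseConvert("41", 16, 10) is 65.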
-- Alex.