XML Headaches
David Bovill
david at openpartnership.net
Mon Jul 9 14:10:37 EDT 2007
On 09/07/07, Malte Brill <revolution at derbrill.de> wrote:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> Works as expected (unless there is more to it)
>
> <?xml version="1.0" encoding="UTF-8"? >
>
> (mind the space before ">") does not. However, the parser does not
> complain and builds the tree. It just loses data then. Seems like
> having a hole in the bucket where certain chars slip through :-)
This was one of the errors picked up when I ran your XML through:
> I also put your XML through this online validation service and found a bunch
> of errors: http://www.xml.com/pub/a/tools/ruwf/check.html
>
> Instead of unidecode(uniencode(myXML,"UTF8"),"ANSI") for the whole
> XML data I have the following script now:
>
> -- Remove byte order mark from UTF8 text
> if charToNum(char 1 of tVar) is 239 then
>   if charToNum(char 2 of tVar) is 187 then
>     if charToNum(char 3 of tVar) is 191 then
>       delete char 1 to 3 of tVar
>     end if
>   end if
> end if
So you do this for any node data? In other words, before adding any data to a
node, you should run it through a handler like this?
on xml_SafeEncode @nodeContents
  put unidecode(uniencode(nodeContents,"UTF8"),"ANSI") into nodeContents
  -- Remove byte order mark from UTF8 text
  put numToChar(239) & numToChar(187) & numToChar(191) into testBomHeader
  if char 1 to 3 of nodeContents = testBomHeader then
    delete char 1 to 3 of nodeContents
  end if
end xml_SafeEncode
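One thing that may be worth checking in a handler like this is the order of the two steps. The bytes 239, 187, 191 are the UTF-8 form of the byte order mark, so they should only be found while the text is still raw UTF-8. Below is a minimal sketch that tests for them first and converts afterwards. The handler name xml_StripBomThenDecode and the variable names are hypothetical, and "ANSI" is assumed to be the language name that unidecode expects.

```livecode
-- Sketch only: xml_StripBomThenDecode is a hypothetical name, not from this thread
on xml_StripBomThenDecode @pText
  -- Compare exactly, so accented upper/lower case chars are not treated as equal
  set the caseSensitive to true
  -- The UTF-8 byte order mark is the byte sequence 239, 187, 191
  put numToChar(239) & numToChar(187) & numToChar(191) into tBom
  -- Test the raw UTF-8 bytes first, before any conversion can change them
  if char 1 to 3 of pText is tBom then
    delete char 1 to 3 of pText
  end if
  -- Then convert from UTF-8 to the single-byte encoding (assumed name "ANSI")
  put unidecode(uniencode(pText,"UTF8"),"ANSI") into pText
end xml_StripBomThenDecode
```

It would be called the same way as the handler above, with the variable passed by reference, for example "xml_StripBomThenDecode tData" before the data is added to a node.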