XML Headaches
David Bovill
david at openpartnership.net
Mon Jul 9 07:48:28 EDT 2007
Is the text actually UTF-8 encoded? Given that it contains an accented e
(é), reading the docs and doing this by hand may be a bit error-prone. The
first thing I'd do is check the XML with a validator and make sure it
passes, before looking for bugs.
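As a rough illustration of that first check, here is a minimal Python sketch (not LiveCode, and not taken from your script). It assumes the file is meant to be UTF-8 and has no encoding declaration saying otherwise; the function name check_xml_bytes is made up for the example.

```python
import xml.etree.ElementTree as ET

def check_xml_bytes(data: bytes) -> str:
    """Classify raw XML bytes as 'ok', 'not utf-8', or 'not well-formed'.

    Hypothetical helper: assumes the document is supposed to be UTF-8.
    """
    try:
        data.decode("utf-8")          # fails on a lone byte 233 (Latin-1 e-acute)
    except UnicodeDecodeError:
        return "not utf-8"
    try:
        ET.fromstring(data)           # well-formedness check only, no DTD/schema
    except ET.ParseError:
        return "not well-formed"
    return "ok"

good = "<p>caf\u00e9</p>".encode("utf-8")     # e-acute stored as two bytes
bad = "<p>caf\u00e9</p>".encode("latin-1")    # e-acute stored as the single byte 233

print(check_xml_bytes(good))
print(check_xml_bytes(bad))
```

If the second case is what your file looks like, the problem is the encoding of the text rather than the XML handling.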
I've got some documentation with links to the best sources I can find here:
http://handlers.rev-co.de/wiki/XML
This bit would seem relevant:
> That means that in a UTF-8 XML document, you cannot simply use a single byte
> with decimal value 233 to represent "é" (and there is no predefined &eacute;
> entity as there is in HTML); instead, you must either enter the UTF-8
> multi-byte escape sequence, or use a special kind of XML reference called a
> character reference:
>
> <p>That is everyone's favourite caf&#233;.</p>
>
> When your text consists primarily of unaccented Roman characters, this is
> often the easiest way to escape the occasional accented or non-Roman
> character. Since "é" appears at position 233 in Unicode (as in ISO-8859-1),
> the XML parser will read the string correctly as "That is everyone's
> favourite café."
>
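To make the two options in that quote concrete, here is a small Python sketch (again only an illustration, with variable names of my own choosing). It parses the same sentence once with the character reference &#233; and once with the literal character stored as UTF-8, and compares the results.

```python
import xml.etree.ElementTree as ET

# Option 1: numeric character reference, so the bytes on disk are pure ASCII.
with_reference = b"<p>That is everyone's favourite caf&#233;.</p>"

# Option 2: the literal character, stored as its multi-byte UTF-8 sequence.
with_utf8 = "<p>That is everyone's favourite caf\u00e9.</p>".encode("utf-8")

text_a = ET.fromstring(with_reference).text
text_b = ET.fromstring(with_utf8).text

# Both forms should give the parser the same string.
print(text_a == text_b)

# The UTF-8 form of the character is two bytes, not the single byte 233.
print("\u00e9".encode("utf-8"))
```

Either form should be acceptable to a conforming parser; the character reference is simply easier to get right when typing XML by hand.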
I also put your XML through this online validation service and found a bunch
of errors: http://www.xml.com/pub/a/tools/ruwf/check.html
Hope this helps.
More information about the use-livecode
mailing list