Strange Entities from htmlText
Sivakatirswami
katir at hindu.org
Sat Jul 23 01:10:29 EDT 2005
I don't really think this is a Rev issue but rather some weird
Microsoft issue, or an email issue?
Some text that originally came from MSWord is passed to an email
(by cut and paste into Mail.app on the Mac) and then to a field. I
then output this via the "htmlText" property to an XML document, which
is destined to run against an XSLT using xsltProc (run via shell
commands from a Rev UI). This XML file is then urlEncoded as prep for
uploading via POST. A Rev CGI gets the POST (the engine is Darwin
running on an Xserve), urlDecodes it and saves it back to an XML
file. The goal (obviously) is that the XML on the server is exactly
the same as what was generated by my Rev app on the remote client
before uploading.
This system is working really well, btw...until I decided to make use
of the htmlText of that field...
In the original document I am seeing curly quotes and curly
apostrophes... which were pasted into the original input field...
now, my script cleans these up to straight quotes first, and then we
get the htmlText...
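For anyone following along outside Rev, here is a minimal sketch of that cleanup step in Python. It is not my actual Rev script. The function name and the list of characters are assumptions for illustration, and it only covers the four common Unicode curly quotes.

```python
# Hypothetical helper: map typographic (curly) quotes to straight ASCII quotes.
CURLY_TO_STRAIGHT = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones; leave everything else alone."""
    return text.translate(str.maketrans(CURLY_TO_STRAIGHT))


print(straighten_quotes("the family\u2019s \u201creal\u201d wealth"))
```

If the pasted text arrives in a different encoding (MacRoman or Windows-1252 bytes rather than Unicode), the characters to match would be different, so a cleanup like this could silently miss them.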
htmlText result: [snippet from a complete XML file]
<p>In The Blessings of Children Tiruvalluvar begins by
describing the benefits of having children and states that an
intelligent child is the greatest blessing to the family and is
indeed the family s real wealth.</p>
If I run this through xsltProc against my style sheet (which turns
the XML into a .shtml file), these all error out as "unknown
entities... unable to parse /file"
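One possible workaround, sketched in Python rather than Rev: XML only predefines five entities (amp, lt, gt, quot, apos), so any other named entity is unknown to the parser unless a DTD declares it. Assuming the offending entities are ordinary HTML named entities (I have not confirmed that), they could be rewritten as numeric character references before the file reaches xsltProc. The function name below is hypothetical.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Rewrite HTML named entities as numeric character references.

    Entities XML already knows, and names that are not HTML entities
    at all, are left untouched.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY.sub(repl, markup)


print(named_to_numeric("<p>the family&rsquo;s real wealth &amp; more</p>"))
```

Numeric references such as &#8217; need no declaration, so the parser should accept them as long as the rest of the file is well-formed.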
I don't see these entities on BBEdit's entity list... and the other
weird thing is the introduction of a space before the closing quote
or apostrophe...
And we also are seeing another gruesome manifestation:
To foster a sense of self-worth in children, corporal
punishment must be eliminated completely. To think that assaulting a
child--a criminal offense between adults--constitutes discipline, is
virtually insane. Yet, in the US, it is still legal in many states.
Discipline means to teach. The only thing the paddle teaches, is
hatred. This hatred is very often repressed and unconsciously
directed toward self. When this happens, you have crippled a mind for
life.
Could the urlEncoding/Decoding be doing something nasty here?
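One way to test that suspicion, sketched in Python: percent-encoding on its own is lossless, but the round trip can change characters if the two ends assume different character sets. The encodings below (MacRoman on the client, Latin-1 on the server) are assumptions picked only to show the effect, not a diagnosis of my setup, and the helper name is hypothetical.

```python
from urllib.parse import quote, unquote


def round_trip(text: str, send_encoding: str, receive_encoding: str) -> str:
    """Percent-encode text as a client might, then decode it as a server might."""
    wire = quote(text, safe="", encoding=send_encoding)
    return unquote(wire, encoding=receive_encoding)


sample = "family\u2019s real wealth"  # contains a curly apostrophe

same = round_trip(sample, "mac_roman", "mac_roman")
mixed = round_trip(sample, "mac_roman", "latin-1")

print(same == sample)   # both ends agree on the character set
print(mixed == sample)  # the ends disagree, so the apostrophe changes
```

If the file saved on the server is byte-for-byte identical to the one generated on the client, the encode/decode step can be ruled out and the problem lies earlier, in the paste or in htmlText itself.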
And what is even weirder: if the user manually enters a quote or
apostrophe in the field... we get what we expect to get:
&quot;
Any clues?
Sivakatirswami