preserve vertical white space in XML?
Sivakatirswami
katir at hindu.org
Wed Jul 13 01:36:42 EDT 2005
Look forward to it... I've already put a request in to our software
acquisitions... you should see a new customer soon.
Since you are working on upgrading...
I would like to see some parameter(s) that allow quotes and
apostrophes to be input and output without being "escaped"
(converted to entities). w3Schools.com says only ampersands and "<"
are truly "illegal", so any XML processor really should escape
these... and they recommend that quotes, apostrophes and the closing
tag marker ">" also be converted... so that is what Rev
does... all five of them. And I don't see a way to turn it off.
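
To make the difference concrete, here is a minimal Python sketch (not Rev, and not how Rev or Ken's library does it) comparing "escape only what is illegal" with "escape all five". The function names are made up for illustration; only `xml.sax.saxutils.escape` is a real standard-library call.

```python
from xml.sax.saxutils import escape


def escape_minimal(text: str) -> str:
    """Escape only '&' and '<' (hypothetical helper for this example)."""
    # '&' must be replaced first so the '&' in '&lt;' is not escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")


def escape_all_five(text: str) -> str:
    """Escape '&', '<', '>' plus both quote characters."""
    # escape() handles &, < and > itself; the dict adds the two quotes.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


sample = 'He said "it\'s 5 > 3 & 2 < 4"'
print(escape_minimal(sample))
print(escape_all_five(sample))
```

The minimal form is only reasonable for element content. Inside an attribute value the quote character that delimits the attribute still has to be escaped, and the sequence "]]>" must not appear literally in text.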
Despite the vast changes in character sets and encodings and unicode
in our modern world.. I'm still a bit old-fashioned. For certain
things I want the root data to stay really "dumb" i.e. true, original
ASCII 0-126 and nothing more. This way we don't have to face *any*
encoding issues where we want the text to be able to flow, like
water, in the future through a large variety of processors, agents
and platforms, not all of which will be html entity aware--i.e. bring
encoding issues to a virtual zero level, and all the time wasted to
program around them also to virtual zero. Output agents later can
make their own decisions about upgrading source data during process
to curly quotes or apostrophes for a more sophisticated output...
but I don't want these in the source data...I'm currently using
xsltproc with shell cmds from Rev...and having to face all this
because the standard unix libraries being called
libxml 20616, libxslt 10111 and libexslt 809
are so strict...
Then I have to run all these 'replace "foo" with "bar" in tText'
commands to get back to where I want to be....
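
The clean-up pass described above amounts to something like the following Python sketch. It assumes the text is element content (not attribute values) and that only the quote and apostrophe entities should be turned back into plain ASCII; `restore_quotes` and `QUOTE_ENTITIES` are names invented for this example.

```python
# Entities that stand for the plain ASCII quote and apostrophe,
# in both named and decimal form.
QUOTE_ENTITIES = {
    "&quot;": '"',
    "&apos;": "'",
    "&#34;": '"',
    "&#39;": "'",
}


def restore_quotes(text: str) -> str:
    """Turn quote/apostrophe entities back into literal ASCII characters.

    '&amp;' and '&lt;' are deliberately left alone so the result is
    still safe to use as XML element content.
    """
    for entity, char in QUOTE_ENTITIES.items():
        text = text.replace(entity, char)
    return text


print(restore_quotes("&quot;it&apos;s&quot; &amp; &lt;b&gt;"))
```

Doing all replacements in one function keeps the list of entities in a single place, which is roughly what a "don't escape quotes" parameter would save having to write.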
Then, on the complete opposite side of the spectrum, for in-house
publishing RAD tools where the XML processes *are* working with a
goal of more sophisticated outputs, we could really use another
parameter that handles the entire entity set (whatever that is) such
that, for example, M-dash gets output to an entity... and furthermore
offer a choice for "by name" or by "decimal" i.e. &mdash; OR
&#8212;
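
As an illustration of the "by name" versus "by decimal" choice, here is a short Python sketch. It is only an assumption about what such a parameter could do, not a description of any existing Rev feature; the two function names are hypothetical, while `html.entities.codepoint2name` and the `xmlcharrefreplace` error handler are standard Python.

```python
from html.entities import codepoint2name


def to_decimal_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_named_entities(text: str) -> str:
    """Prefer HTML entity names, falling back to decimal references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                          # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")   # e.g. &mdash;
        else:
            out.append(f"&#{cp};")                  # no name known
    return "".join(out)


sample = "before\u2014after"  # \u2014 is the em dash
print(to_decimal_entities(sample))
print(to_named_entities(sample))
```

One caveat: names such as `&mdash;` come from HTML. A plain XML parser only knows the five predefined entities unless a DTD declares the rest, so the decimal form is the safer one for XML output.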
Now, some of this is out of the box in terms of rigid XML standards,
but even Adobe has decided to honor vertical white space (char10 or
char13) inside child nodes on XML import. Such white space, by spec
of course is to be ignored... but a line break is basic in the word
processing world... so to wipe these is not helpful... OK I guess
that's yet another request: to honor char(10) throughout the
processing, in and out. Just pass it...
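
For comparison, this is what "just pass it" looks like with one particular parser. The sketch below uses Python's standard `xml.etree.ElementTree` and only shows that line feeds inside element text survive a parse and re-serialize round trip there; the `<verse>` element is made up, and other tools may behave differently.

```python
import xml.etree.ElementTree as ET

# Element text containing char(10) line breaks.
source = "<verse>line one\nline two\nline three</verse>"

element = ET.fromstring(source)
round_tripped = ET.tostring(element, encoding="unicode")

print(repr(element.text))
print(round_tripped == source)
```

Note that XML parsers normalize char(13) and char(13)+char(10) line endings to a single char(10) on input, so char(10) is the one to rely on.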
Am I making sense? Or asking for too much? Maybe this is all handled
transparently with the encoding setting at the beginning of the doc...
(I need to go to "Unicode School!")
Sivakatirswami
On Jul 12, 2005, at 2:15 AM, Ken Ray wrote:
>> Ken... just a thought... you might sell more software if you had a
>> full demo that timed out... even if you only gave it 24 hours or 72
>> hours...
>>
>> I generally always want to know what I'm getting, and since the
>> standard edition of your XML library is locked... I can't see the
>> scripts.
>>
>
> Thanks, I'll consider it... but the library is kind of a "black
> box", so I was considering people only needing access to the scripts
> if they were wanting to change the default behavior... but as I'm in
> the middle of upgrading the library, I'll think about your suggestion
> before I release the next version.
>
> Ken Ray
> Sons of Thunder Software
> Email: kray at sonsothunder.com
> Web Site: http://www.sonsothunder.com/
>