preserve vertical white space in XML?

Sivakatirswami katir at hindu.org
Wed Jul 13 00:36:42 CDT 2005


Look forward to it... I've already  put a request in to our software  
acquisitions... you should see a new customer soon.

Since you are working on upgrading...

I would like to see some parameter(s) that allow input and output of
quotes and apostrophes without "escaping" them (converting them to
entities). W3Schools.com says only ampersands and "<" are truly
"illegal", so any XML processor really should escape these... and they
recommend that quotes, apostrophes and the closing tag marker ">"
also be converted... so that is what Rev does... all five of them.
And I don't see a way to turn it off.
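
To make the request concrete, here is a minimal sketch in Python (not Rev) of the "minimal" escaping I have in mind. The function name escape_minimal is my own invention for illustration and is not part of Rev or of any XML library. It assumes the string is going into element text, not into an attribute value, where the delimiting quote character would still have to be escaped.

```python
def escape_minimal(text):
    """Escape only "&" and "<", leaving quotes, apostrophes and ">" alone.

    Hypothetical helper for illustration; intended for element text only.
    """
    # The ampersand must be handled first, otherwise the "&" produced
    # by "&lt;" would itself be escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;")


if __name__ == "__main__":
    print(escape_minimal('He said "it\'s < 5 & > 3"'))
```

One caveat with this sketch: text containing the literal sequence "]]>" would still need its ">" escaped to stay well-formed.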

Despite the vast changes in character sets, encodings and Unicode
in our modern world... I'm still a bit old-fashioned. For certain
things I want the root data to stay really "dumb", i.e. true, original
ASCII 0-126 and nothing more. This way we don't have to face *any*
encoding issues where we want the text to be able to flow, like
water, in the future through a large variety of processors, agents
and platforms, not all of which will be HTML-entity aware, i.e. bring
encoding issues to a virtual zero level, and all the time wasted
programming around them also to virtual zero. Output agents can later
make their own decisions about upgrading source data during processing
to curly quotes or apostrophes for a more sophisticated output...
but I don't want these in the source data... I'm currently using
xsltproc with shell cmds from Rev... and having to face all this
because the standard Unix libraries being called

libxml 20616, libxslt 10111 and libexslt 809

are so strict...

Then I have to run all these 'replace "foo" with "bar" in tText'
statements to get back to where I want to be....
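
For anyone curious, the clean-up pass amounts to something like the following, sketched here in Python, not Rev. The table QUOTE_ENTITIES and the function restore_plain_quotes are hypothetical names of my own. The sketch assumes the quote entities occur only in element text; applied blindly to attribute values it could break the markup.

```python
# Entity spellings for the double quote and the apostrophe,
# mapped back to the plain ASCII characters.
QUOTE_ENTITIES = {
    "&quot;": '"',
    "&#34;": '"',
    "&apos;": "'",
    "&#39;": "'",
}


def restore_plain_quotes(xml_text):
    """Undo quote/apostrophe escaping; "&amp;" and "&lt;" are left alone."""
    for entity, char in QUOTE_ENTITIES.items():
        xml_text = xml_text.replace(entity, char)
    return xml_text


if __name__ == "__main__":
    print(restore_plain_quotes("<p>&quot;it&apos;s&quot; &amp; &lt;</p>"))
```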

Then, on the complete opposite side of the spectrum, for in-house
publishing RAD tools where the XML processes *are* working with a
goal of more sophisticated outputs, we could really use another
parameter that handles the entire entity set (whatever that is) such
that, for example, an em dash gets output as an entity... and
furthermore offers a choice of "by name" or "by decimal", i.e.
&mdash; OR &#8212;
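
As a rough illustration of that "by name" versus "by decimal" switch, here is a sketch in Python. The function to_entities and its style parameter are hypothetical; only html.entities.codepoint2name comes from the Python standard library. Note the assumption that the names are the HTML entity names: plain XML predefines only five entities, so a name like "mdash" would need a DTD declaring it on the receiving end.

```python
from html.entities import codepoint2name


def to_entities(text, style="decimal"):
    """Replace every non-ASCII character with an entity reference.

    style="name" uses the HTML entity name when one exists and falls
    back to a decimal reference otherwise; style="decimal" always
    emits a decimal reference.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through untouched
        elif style == "name" and cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append("&#%d;" % cp)
    return "".join(out)


if __name__ == "__main__":
    sample = "before\u2014after"  # \u2014 is the em dash
    print(to_entities(sample, "name"))
    print(to_entities(sample, "decimal"))
```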

Now, some of this is out of the box in terms of rigid XML standards,
but even Adobe has decided to honor vertical white space (char 10 or
char 13) inside child nodes on XML import. Such white space, by spec
of course, is to be ignored... but a line break is basic in the word
processing world... so to wipe these is not helpful... OK, I guess
that's yet another request: to honor char(10) throughout the
processing, in and out. Just pass it...
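
For comparison only, this is the behavior I am asking for, shown with Python's standard xml.etree.ElementTree. It says nothing about what Rev's XML library or xsltproc do; it is just a sketch of a parser passing a line feed inside a child node straight through, in and out. The sample document and variable names are made up.

```python
import xml.etree.ElementTree as ET

# A child node whose text contains a line feed (char 10).
source = "<note><body>first line\nsecond line</body></note>"

root = ET.fromstring(source)
body_text = root.find("body").text

# Serialize back to a string; the line feed should still be there.
round_trip = ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(repr(body_text))
    print(repr(round_trip))
```

A carriage return (char 13) is a different story: XML parsers normalize line endings to char 10 on input, so only the line feed can be expected to survive unchanged.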

Am I making sense? Or asking for too much? Maybe this is all handled
transparently with an encoding setting at the beginning of the doc...
(I need to go to "Unicode School!")

Sivakatirswami



On Jul 12, 2005, at 2:15 AM, Ken Ray wrote:

>> Ken... just a thought... you might sell more software if you had a
>> full demo that timed out... even if you only gave it 24 hours or 72
>> hours...
>>
>> I generally always want to know what I'm getting, and since the
>> standard edition of your XML library is locked... I can't see the
>> scripts.
>>
>
> Thanks, I'll consider it... but the library is kind of a "black
> box", so I was considering people only needing access to the scripts
> if they were wanting to change the default behavior... but as I'm in
> the middle of upgrading the library, I'll think about your suggestion
> before I release the next version.
>
> Ken Ray
> Sons of Thunder Software
> Email: kray at sonsothunder.com
> Web Site: http://www.sonsothunder.com/
>



