htmlText, xHTML and revXML

Jim Ault JimAultWins at yahoo.com
Sat Dec 23 13:03:10 EST 2006


Unfortunately my experience with the different protocols is very limited.
The world wide conferences try to accommodate all the interested parties
when they publish their standards, but I don't study the rationale they use.

I know part of the rationale is driven by the big companies.  Someone like
Ken Ray can give a good answer.  So many flavors of text markup languages
(TML) have been promulgated for different purposes that I am not sure there
can ever be a standard way of parsing them.

I think that in the beginning a markup language was only for the display of
elements in a 'browser', not an organized data system.  One key part of a
browser program is that if it does not know what to do with a tag, it
silently ignores it rather than producing an error message.  In other words,
errors do not break the page; they result in something displayed poorly or
not at all.
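
An XML parser is the opposite: it stops at the first thing it cannot
resolve. As a rough sketch only, and not something tested against every
field, the function below shows the kind of clean-up htmlText would probably
need before revCreateXMLTree accepts it. The name html_ToXmlText, the wrapper
element name "htmltext" and the short entity list are assumptions made for
illustration. The sketch rests on two points: XML predefines only five named
entities, and it allows exactly one root element, whereas htmlText is a run
of sibling <p> elements.

```livecode
-- Hypothetical helper, not from the thread: prepare a field's htmlText
-- for revCreateXMLTree.
function html_ToXmlText someHtml
    -- XML predefines only &amp; &lt; &gt; &quot; and &apos;.
    -- Any other named entity has to become a numeric reference.
    -- This short list is an assumption; extend it for your own text.
    replace "&nbsp;" with "&#160;" in someHtml
    replace "&copy;" with "&#169;" in someHtml
    replace "&eacute;" with "&#233;" in someHtml

    -- XML allows only one root element, so wrap the run of <p> elements.
    -- The element name "htmltext" is arbitrary.
    return "<htmltext>" & someHtml & "</htmltext>"
end html_ToXmlText
```

With the html_AttributeValues function quoted further down, a call might look
like html_AttributeValues(html_ToXmlText(the htmlText of field "Body"), "name", "a"),
where field "Body" is a made-up name. Any tag that htmlText leaves unclosed
would still stop the parser, so treat this as a starting point.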

Hopefully someone with real knowledge in this area will chime in.

Jim Ault
Las Vegas


On 12/23/06 8:48 AM, "David Bovill" <david at openpartnership.net> wrote:

> Jim - I thought that was the whole point of xHTML?
> 
> That is, xHTML is HTML that works with XML parsers - that is why you can
> view xHTML outlines in tools such as GoLive. I assumed from its look that
> htmltext was xHTML compliant - and so always assumed that it would be
> straightforward to parse with the XML tools....
> 
> The question is where the logic breaks - is it that xHTML cannot be parsed
> with the XML tools in Rev - or is it that for some crazy reason htmltext is
> not xHTML compliant (ie a subset of xHTML) and therefore not valid XML? If
> the latter, which I suspect, what would I need to do to htmltext to make it
> valid XML?
> 
> On 23/12/06, Jim Ault <JimAultWins at yahoo.com> wrote:
>> 
>> HTML text is a system of tags that signal what item is <start> </end>
>> whereas XML is much more of an 'outliner' with inheritance defining
>> children and nodes.  They both have the <> </> look, but HTML is not
>> regimented the same way except for Tables, Frames, and a few other
>> constructs.
>> 
>> Now if you add in javascript and css, HTML is even less like XML, so the
>> parent.child relationship is even more remote.
>> 
>> It is hard to imagine a single parser that would work for both.  Perhaps
>> in special cases that you generate to stay within rules.
>> 
>> Jim Ault
>> Las Vegas
>> 
>> 
>> On 12/22/06 10:17 PM, "David Bovill" <david at openpartnership.net> wrote:
>> 
>>> I am using the script to parse the htmltext of Revs text fields - so it
>>> is a nice fixed target. Here is the script I have at the moment modified
>>> slightly from your suggestions to work with anchors:
>>> 
>>> function html_ExtractAnchors someHtml
>>>     put someHtml into htmlPage
>>>     replace CR with empty in htmlPage --text is now one line
>>>     replace "name=" with "name=" & CR in htmlPage
>>>     replace "</a" with "</a" & CR in htmlPage
>>> 
>>>     -- filter htmlPage with "*http://*"
>>>     -- set the itemdel to ">"
>>>     filter htmlPage with (quote & "*</a")
>>>     set the itemdel to quote
>>> 
>>>     put empty into newLinkList
>>>     repeat for each line LNN in htmlPage
>>>         put item 2 of LNN & cr after newLinkList
>>>         -- put item 1 of LNN & cr after newLinkList
>>>     end repeat
>>>     delete last char of newLinkList
>>>     return newLinkList
>>> end html_ExtractAnchors
>>> 
>>> NB - has anyone managed to use the XML libraries on htmltext? This is the
>>> sort of thing I mean - which fails with html entities:
>>> 
>>> function html_AttributeValues someHtml, attributeName, childName, depth
>>>     -- does not work with htmlEntities!
>>> 
>>>     put revCreateXMLTree(someHtml, true, true, false) into treeID
>>>     if char 1 to 6 of treeID is "xmlerr" then
>>>         put someHtml
>>>         opn_Notify treeID, true
>>>         exit to top
>>>     end if
>>> 
>>>     if depth is empty then put -1 into depth
>>>     put revXMLRootNode(treeID) into startNode
>>>     put revXMLAttributeValues(treeID, startNode, childName, \
>>>             attributeName, CR, depth) into attributeValues
>>>     revDeleteXMLTree treeID
>>>     return word 1 to -1 of attributeValues
>>> end html_AttributeValues
>>> 
>>> Would be nice...
>>> _______________________________________________
>>> use-revolution mailing list
>>> use-revolution at lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-revolution




