htmlText, xHTML and revXML

Jim Ault JimAultWins at yahoo.com
Sat Dec 23 02:12:30 EST 2006


HTML is a system of tags that mark where an item starts and ends
(<tag> ... </tag>), whereas XML is much more of an 'outliner', with nesting
defining parents, children and nodes.  They both have the <> </> look, but
HTML is not regimented the same way, except for tables, frames, and a few
other constructs.

Now if you add in JavaScript and CSS, HTML is even less like XML, so the
parent/child relationship is even more remote.

It is hard to imagine a single parser that would work for both, except
perhaps in special cases where you generate the HTML yourself so that it
stays within the rules.
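
As a rough, untested sketch of that kind of special case: a Rev field's
htmlText is generated for you, so you could try massaging it into something
the XML library will accept before calling revCreateXMLTree. XML itself only
predefines five named entities (&amp; &lt; &gt; &quot; &apos;), so any other
named entity has to become a numeric reference first. The handler name
html_MakeXmlSafe, the <root> wrapper and the field name "Notes" are all made
up for illustration, and the two entities shown are only examples; a real
list would need every entity your fields can produce.

```livecode
-- Hypothetical helper, not part of Revolution or the XML library.
-- Assumes someHtml is the htmlText of a field, i.e. markup Rev generated.
function html_MakeXmlSafe someHtml
    -- XML predefines only amp, lt, gt, quot and apos, so swap other
    -- named entities for numeric references (extend this list as needed)
    replace "&nbsp;" with "&#160;" in someHtml
    replace "&eacute;" with "&#233;" in someHtml

    -- an XML document needs exactly one root element, but htmlText is
    -- usually a run of sibling <p> elements, so wrap it in an invented root
    return "<root>" & someHtml & "</root>"
end html_MakeXmlSafe

-- Possible usage (the field name is an assumption):
--   put html_MakeXmlSafe(the htmlText of field "Notes") into safeXml
--   put revCreateXMLTree(safeXml, true, true, false) into treeID
```

Anything in the markup that is unclosed or otherwise not well-formed would
still come back from revCreateXMLTree as an "xmlerr" string, so the error
check is still needed.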

Jim Ault
Las Vegas


On 12/22/06 10:17 PM, "David Bovill" <david at openpartnership.net> wrote:

> I am using the script to parse the htmlText of Rev's text fields, so it is a
> nice fixed target. Here is the script I have at the moment, modified slightly
> from your suggestions to work with anchors:
> 
> function html_ExtractAnchors someHtml
>     put someHtml into htmlPage
>     replace CR with empty in htmlPage --text is now one line
>     replace "name=" with "name=" & CR in htmlPage
>     replace "</a" with "</a" & CR in htmlPage
> 
>     -- filter htmlPage with "*http://*"
>     -- set the itemdel to ">"
>     filter htmlPage with (quote & "*</a")
>     set the itemdel to quote
> 
>     put empty into newLinkList
>     repeat for each line LNN in htmlPage
>         put item 2 of LNN & cr after newLinkList
>         -- put item 1 of LNN & cr after newLinkList
>     end repeat
>     delete last char of newLinkList
>     return newLinkList
> end html_ExtractAnchors
> 
> NB - has anyone managed to use the XML libraries on htmlText? This is the
> sort of thing I mean, which fails with HTML entities:
> 
> function html_AttributeValues someHtml, attributeName, childName, depth
>     -- does not work with htmlEntities!
> 
>     put revCreateXMLTree(someHtml, true, true, false) into treeID
>     if char 1 to 6 of treeID is "xmlerr" then
>         put someHtml
>         opn_Notify treeID, true
>         exit to top
>     end if
> 
>     if depth is empty then put -1 into depth
>     put revXMLRootNode(treeID) into startNode
>     put revXMLAttributeValues(treeID, startNode, childName, attributeName, CR, depth) into attributeValues
>     revDeleteXMLTree treeID
>     return word 1 to -1 of attributeValues
> end html_AttributeValues
> 
> Would be nice...