massive xml docs

Marielle Lange mlange at lexicall.org
Mon Feb 13 08:14:32 EST 2006


Hi Todd,

I don't know how it would behave with a 100MB file, but here is the
method I use to get the data from a particular node of the tree
without having to construct the full tree.

Best wishes,
Marielle

...
------------------------------------------------------------------------
#
#  Get Tag Content In XML tree
#
# --   @Description: Get the content of a tag, given a path in the xml tree
# --                 A tag tree has the following syntax: node1:node2:node3:node4
# --                 <node1>...<node2>...<node3>data</node3>...</node2>...</node1>
# --   @Returns:  Content part when <Tag( Params)?> __Content__</Tag>    (text string)

function getTagContent_XML pXMLtext, pTagTree
   -- terminate() is a custom error handler that is not included here
   IF pXMLtext is empty THEN terminate("BUG x. An empty content was given to parse. pXMLtext shouldn't be empty in function getTagContent_XML.")
   IF pTagTree is empty THEN terminate("BUG x. No Tag Tree was specified. pTagTree shouldn't be empty in function getTagContent_XML.")
   --------
   set the itemdel to ":"
   replace quote with empty in pTagTree    -- This is to get rid of quotes in case there are any
   ----------------------
   -- Narrow the text down one path level at a time
   repeat for each item tTag in pTagTree
     put getTagContent(pXMLtext, tTag) into pXMLtext
   end repeat
   return pXMLtext
end getTagContent_XML
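
A minimal usage sketch, assuming the loop above hands each path item to getTagContent (defined further down) and that the helper handlers the post leaves out are available. The sample XML, the variable names and the mouseUp handler are made up for illustration.

```livecode
on mouseUp
   -- Hypothetical sample data, not from the original post
   put "<library><book><title>Dune</title></book></library>" into tXML
   -- Walk down library -> book -> title without building a full tree
   put getTagContent_XML(tXML, "library:book:title") into tTitle
   -- tTitle should now hold the text of the innermost <title> element
   answer tTitle
end mouseUp
```

Each pass of the loop replaces the working text with the content of the next tag in the path, so the text shrinks at every level.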
...
function getTagContent pXMLtext, pTagName
# --   @Requires: swapEOL()  - not included here -- swaps end of lines from cr to ¬ to allow for multiline matches with matchtext
# --   @Requires: stripInitialTabs() -- not included here
   put swapEOL(pXMLtext, "remove") into pXMLtext
   -- (.+?) is non-greedy: it stops at the first closing tag
   if matchtext(pXMLtext, "(?i)<" & pTagName & "[ ]?[^>]*>(.+?)</" & pTagName & ">", tTagContent) is false then return empty
   put swapEOL(tTagContent, "restore")  into tTagContent
   put stripInitialTabs(tTagContent) into pXMLtext
   return pXMLtext
end getTagContent
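
The handlers terminate, swapEOL and stripInitialTabs are not included in the post. The following is only a guess at what they might look like, based on the @Requires notes and on how they are called above. All three bodies are assumptions, not the original code.

```livecode
-- Assumed stand-in: report the problem and stop the current script
on terminate pMessage
   answer error pMessage
   exit to top
end terminate

-- Assumed stand-in: "remove" swaps each cr for "¬" so the text becomes one line
-- (in a regular expression "." does not match a line break by default);
-- any other mode swaps "¬" back to cr
function swapEOL pText, pMode
   if pMode is "remove" then
      replace cr with "¬" in pText
   else
      replace "¬" with cr in pText
   end if
   return pText
end swapEOL

-- Assumed stand-in: delete the leading tabs of every line
function stripInitialTabs pText
   put empty into tResult
   repeat for each line tLine in pText
      put tLine into tClean
      repeat while char 1 of tClean is tab
         delete char 1 of tClean
      end repeat
      put tClean & cr after tResult
   end repeat
   delete char -1 of tResult   -- drop the trailing cr added by the loop
   return tResult
end stripInitialTabs
```

Note that this swapEOL sketch would also turn any "¬" already present in the data into a line break when restoring.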
.....


--------------------------------------------------------------------------------
Marielle Lange (PhD),  Psycholinguist

Alternative emails: mlange at blueyonder.co.uk,

Homepage                                        http://homepages.widged.com/mlange/
Easy access to lexical databases                http://lexicall.widged.com/
Supporting Education Technologists              http://revolution.widged.com/wiki/




More information about the use-livecode mailing list