Getting the text content of an HTML page

Scott Morrow scott at elementarysoftware.com
Sun Aug 3 06:17:10 EDT 2008


Hello Heather,

Mark's use of filter answered your question nicely.

This chunking method illustrates another, less elegant, way of doing  
what your pseudo-code below describes.


   put field "theField" into tText
   -- put every tag on a line of its own
   replace "<" with (CR&"<") in tText
   replace ">" with (">"&CR) in tText

   -- build a new list without the "<" or ">" lines
   repeat for each line tLine in tText
     if ("<" is not in tLine) AND (">" is not in tLine) then
       put tLine &CR after tNewText
     end if
   end repeat
   put tNewText
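
If it helps, here is the same idea wrapped up as a function so it can be
reused. This is only a sketch: the name stripTagLines and the parameter
pHTML are made up for illustration, and it goes a little beyond the
script above by skipping blank lines and dropping the trailing return.

```livecode
function stripTagLines pHTML
   local tText, tNewText
   put pHTML into tText

   -- put every tag on a line of its own
   replace "<" with (CR & "<") in tText
   replace ">" with (">" & CR) in tText

   repeat for each line tLine in tText
      -- skip any line that holds a tag
      if ("<" is in tLine) or (">" is in tLine) then next repeat
      -- skip lines that are empty or contain only whitespace
      if word 1 of tLine is empty then next repeat
      put tLine & CR after tNewText
   end repeat

   -- drop the trailing return left by the loop
   delete char -1 of tNewText
   return tNewText
end stripTagLines
```

It could then be called with something like:

   put stripTagLines(field "theField") into field "theField"

Like the script above, this assumes each tag sits on a single line of
the source; a tag whose attributes wrap over several lines would leave
its middle lines behind.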




Scott Morrow

Elementary Software
(Now with 20% less chalk dust!)
web       http://elementarysoftware.com/
------------------------------------------------------


On Aug 3, 2008, at 1:06 AM, H Baric wrote:

> Okay, can someone please tell me if this (or something remotely like it)
> would be possible (if script was written correctly, which I can't for the
> life of me work out how):
>
> put url "somewebpage" into field "theField"
>
> put return before all the "<"
>
> put return after all the ">"
>
> replace lines containing "<" and ">" with empty
>
>
>
> HAHA, I know it's not anywhere near the solution to extracting text from
> HTML (Eric's function does a wayyy better job of that), but I'd LOVE to know
> how to actually write the script if not anything else, because it's driven
> me crazy for about two hours now! Grrr
>
> If it does something wonderful that would be great too :P
>
> Thanks so much  :)
>
> Heather
>
> _______________________________________________
> use-revolution mailing list
> use-revolution at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-revolution







