Getting the text content of a HTML page
sarah.reichelt at gmail.com
Sun Aug 3 03:56:31 CDT 2008
On Sun, Aug 3, 2008 at 12:31 AM, H Baric <hbaric at gmail.com> wrote:
> Hi again *blush*
> Okay, this is no doubt something very simple even though I've searched through the docs but can't find exactly how to do this seemingly straightforward task:
> * Get the text only from a web page - no html tags, no formatting etc.
> I can get the html doc to appear in my field by using:
> put url "http://www.thePage.com" into thePage
> put thePage into field "The Page"
> (is that correct?) If so, now what? :D
Welcome to the Revolution and please don't feel bad about asking
questions. It's great when people ask beginner-level questions, as I
think a lot of beginners don't like to ask and so get discouraged.
Your script for getting the contents of a web page is perfect.
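To turn that into plain text, there is a trick that may work if the
web page is not too complex: set the htmlText of a field to the page
source, then read back the field's text, which contains no tags.
Fields only understand a limited set of HTML tags, so heavily
formatted pages may not come out cleanly. Try this: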
put url "http://www.thePage.com" into thePage
set the htmlText of the templateField to thePage
put the text of the templateField into field "The Page"
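If you end up doing this in more than one place, the same three lines
could be wrapped in a function. This is only a rough sketch: the
function name plainTextOfPage and its parameter pURL are made up for
illustration, and it assumes that an empty return value is an
acceptable way to signal a failed download. It also resets the
templateField afterwards so that fields created later don't inherit
the page text.

```livecode
-- Hypothetical helper: returns the plain text of the page at pURL,
-- or empty if the download failed.
function plainTextOfPage pURL
   local tPage, tText
   put url pURL into tPage
   -- after a failed download, the result holds an error message
   if the result is not empty then return empty
   -- let the field interpret the HTML, then read back the bare text
   set the htmlText of the templateField to tPage
   put the text of the templateField into tText
   -- restore the templateField to its default properties
   reset the templateField
   return tText
end plainTextOfPage
```

You would then call it like this:
put plainTextOfPage("http://www.thePage.com") into field "The Page"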