Getting the text content of a HTML page
hbaric at gmail.com
Sun Aug 3 05:11:47 CDT 2008
I have to say I feel a bit like a *duhh* dog-paddling around in brain soup
here. But I'm determined, and definitely progressing daily thanks to all the
wonderful readily available documents and examples, as well as all the
friendly live help on demand here. :)
Wow, your script even leaves blank lines between paragraphs (though multiple
blank ones in some cases) - which is what I was trying to do by adjusting
Eric's function, to try and maintain the paragraphs layout somewhat. With no
success! Why doesn't "put return & return" work? LOL
Anyway, cool thanks for that :)
But, just out of interest, is there a way to script "if there is more than
one blank line together, get rid of the extras and just have one"?
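
One possible approach, as a minimal sketch only: walk the text line by line and keep just the first blank line of each run. The handler name collapseBlankLines and its variable names are made up for illustration, and a line is treated as blank when it holds nothing but whitespace.

```livecode
function collapseBlankLines pText
   local tResult, tLastWasBlank
   put false into tLastWasBlank
   repeat for each line tLine in pText
      if word 1 of tLine is empty then
         -- blank or whitespace-only line: keep only the first one of a run
         if not tLastWasBlank then put return after tResult
         put true into tLastWasBlank
      else
         put tLine & return after tResult
         put false into tLastWasBlank
      end if
   end repeat
   -- drop the trailing return added by the loop
   if the last char of tResult is return then delete the last char of tResult
   return tResult
end collapseBlankLines
```

It could then be called with something like: put collapseBlankLines(field "The Page") into field "The Page"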
Cheers, Heather whose study strategy is to follow whatever new path is the
most intriguing in the moment and forget what she was doing (but doesn't
care as long as she is learning SOMEthing). :-|
Welcome to the Revolution and please don't feel bad about asking
questions. It's great when people ask beginner level questions as I
think a lot of beginners don't like to ask and so get discouraged.
Your script for getting the contents of a web page is perfect.
For transforming that to plain text, there is a trick which may work
if the web page is not too complex. Try this:
put url "http://www.thePage.com" into thePage
-- let a field parse the HTML, then read back its plain text
set the htmlText of the templateField to thePage
put the text of the templateField into field "The Page"
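
The same trick could be wrapped in a function, roughly as sketched below. The handler name pageAsPlainText and the parameter pURL are hypothetical. The sketch assumes that a failed download leaves an error message in the result, and it resets the templateField afterwards so that fields created later do not inherit the page text.

```livecode
function pageAsPlainText pURL
   local tHTML, tPlain
   put url pURL into tHTML
   -- assumption: a failed download leaves an error message in the result
   if the result is not empty then
      return empty
   end if
   -- let the template field parse the HTML, then read back the plain text
   set the htmlText of the templateField to tHTML
   put the text of the templateField into tPlain
   -- restore the template field to its default properties
   reset the templateField
   return tPlain
end pageAsPlainText
```

For example: put pageAsPlainText("http://www.thePage.com") into field "The Page"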
use-revolution mailing list
use-revolution at lists.runrev.com