How to structure HTML text (tags and attributes) for processing in LiveCode?

Keith Clarke keith.clarke at clarkeandclarke.co.uk
Sun Jun 12 06:56:48 EDT 2011


Thanks for the steer, Stephen - I have Remo but hadn't discovered Jerry's tutorials before. Much to study there.

The screen-scraping lessons start from the premise that the HTML source is already reasonably structured into lines (for filtering, etc.), so they don't help with my challenge of getting the page source into a state where I can apply some of these techniques.

However, they did make me think of experimenting with the replace command - something like replace ">" with ">" & return in tHTML (replace works on literal text, not wildcards) - to soft-wrap the HTML by tag.
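
A minimal sketch of how that soft-wrap idea might look, offered as an illustration rather than a tested solution. The handler name tagsAndAttributes and the variable names are made up for the example. It assumes attribute values are double-quoted and written without spaces around the "=", and it would mis-split any ">" or "<" that appears inside scripts, comments or attribute values. Using quote as the itemDelimiter is one way around the lack of an obvious single-character delimiter: the even-numbered items of a tag line are then the attribute values, and the word just before each one is its name.

```livecode
-- Sketch only: handler and variable names are illustrative assumptions.
function tagsAndAttributes tHTML
   local tOut, tTag, tName, i
   -- flatten the source so a tag split across several lines stays together
   replace return with space in tHTML
   -- break before every "<" and after every ">" so each tag gets its own line
   replace "<" with return & "<" in tHTML
   replace ">" with ">" & return in tHTML
   -- keep only the lines that look like tags
   filter tHTML with "<*>"
   -- splitting on the quote character makes every even-numbered item a value
   set the itemDelimiter to quote
   repeat for each line tTag in tHTML
      put tTag & return after tOut
      repeat with i = 2 to the number of items of tTag step 2
         -- the text before the opening quote ends with something like href=
         put last word of item (i - 1) of tTag into tName
         if last char of tName is "=" then delete last char of tName
         put tab & tName & "=" & item i of tTag & return after tOut
      end repeat
   end repeat
   return tOut
end tagsAndAttributes
```

Called as put tagsAndAttributes(url "http://www.example.com/") into tList (the URL is a placeholder), this should return one line per tag, followed by one tab-indented name=value line per attribute. For anything beyond simple pages, the regex-based replaceText and matchText functions are probably a better fit than literal replace.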

So, I think I'm on the right track now.
Best,
Keith..

On 12 Jun 2011, at 10:57, stephen barncard wrote:

> Jerry Daniels has an excellent series on screen scraping. Several video
> lessons.
> 
> http://revmentor.com/business-logic-screen-scraping-1
> 
> 
> On 12 June 2011 02:27, Keith Clarke <keith.clarke at clarkeandclarke.co.uk> wrote:
> 
>> Hi folks,
>> Local rainy Saturday night broadband load prevented me from seeing the
>> whole of Colin Holgate's fascinating LiveCode Live presentation on working
>> with web page source HTML text - so I can't wait for the recording!
>> 
>> Meanwhile, I'm trying to extract various html tags and specific attributes
>> from a page's source code - you know, this and that, where
>> <tag stuff="this" other_stuff="that"></tag>
>> 
>> I'm trying to create the situation where I can iterate through the text
>> using something like 'repeat for each tag' and within that loop, 'repeat for
>> each attribute' - the question is, how to get the source HTML text
>> structured and delimited so that 'HTML tag = line' and 'HTML tag attribute =
>> Item'
>> 
>> Given there are no obvious single character itemDelimiters in HTML and the
>> inefficiency of building-up an algorithm from scratch with chunk functions,
>> are any specialised resources, techniques or tricks available - maybe I
>> missed something in the libURL feature-set?
>> Best,
>> Keith..




