How to structure HTML text (tags and attributes) for processing in LiveCode?

Björnke von Gierke bvg at mac.com
Sun Jun 12 09:50:57 EDT 2011


nbsp means a non-breaking space. most html renderers collapse runs of spaces into one, for historical reasons as far as i know. thus the nbsp was introduced; it can appear anywhere in a text, and is most often used for basic indentation. however, filter only works on whole lines, so it is not helpful here. you should use 'replace' to swap out individual occurrences of a string that can't appear in normal text:

replace "&nbsp;" with space in theHTML
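
a slightly fuller sketch of the same idea, in case the page mixes several spellings of the non-breaking space. the function name cleanNbsp and the parameter pHTML are made up for illustration, and numToCodepoint assumes LiveCode 7 or later; on older engines numToChar(160) depends on the platform's character set, as far as i know, so it may not match the literal character:

```livecode
-- hypothetical helper, not part of the original message
function cleanNbsp pHTML
   -- the named entity as it appears in the page source
   replace "&nbsp;" with space in pHTML
   -- the numeric form of the same entity
   replace "&#160;" with space in pHTML
   -- the literal character U+00A0 (assumes LiveCode 7 or later)
   replace numToCodepoint(160) with space in pHTML
   return pHTML
end cleanNbsp
```

it would be called as: put cleanNbsp(theHTML) into theHTML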

On 12 Jun 2011, at 15:42, Keith Clarke wrote:

> Thanks for the insights Jim (and Stephen) - all very useful.
> A list of stuff is now emerging from the depths of the page. The only problem I have now is some stubborn '&nbsp;' characters that don't respond to filtering without "&nbsp;" or numToChar(160).
> Any ideas?
> Best,
> Keith..
> 
> On 12 Jun 2011, at 14:18, Jim Ault wrote:
> 
>> I forgot to mention the old frames style if you are looking into archives on old sites,
>> and <IFRAME> on newer sites, easy to detect, but now you have a second <head> </head> <body> </body>.
>> 
>> On Jun 12, 2011, at 4:14 AM, Keith Clarke wrote:
>> 
>>> I've got the HTML source into a reasonable shape for processing with line and item chunk expressions by using:
>>> 
>>> put field "fld Page Source Code" into tHTML
>>> replace "/div>" with "/div>" & return in tHTML
>>> replace "/tr>" with "/tr>" & return in tHTML
>>> replace "/td>" with "/td>" & tab in tHTML
>>> filter tHTML with <strings that isolate only the interesting, data-laden table rows>
>>> 
>>> So, I can now have line-level chunk expressions mapped to divs and table row tags, together with item-level expressions for iterating through the tags and their attributes within table rows. Nice!
>>> 
>>> Now the rich seams have been revealed, it's time to start digging out them there nuggets! :-)
>> 
>> Jim Ault
>> Las Vegas
> 
> 




