Anyone got one of these?
Jim Ault
JimAultWins at yahoo.com
Fri Jan 26 16:20:39 EST 2007
On 1/26/07 10:56 AM, "Chipp Walters" <chipp at chipp.com> wrote:
> function stripAllTagsBut pHtml,pTagsList
> --> pTagsList IS A LIST OF TAGS NOT TO EXCLUDE FROM PARSING
> --> EX. LINE 1 OF pTagsList CAN BE "img" AND LINE 2 CAN BE "b", etc..
> It's used to strip all tags from HTML but those in the pTagsList parameter.
>
> IOW, it can be used to grab the HTML of a page, and strip everything but the
> img tags.
>
Do you need them in sequential order?
Assuming yes, I would start by making the 'un-naughty bits' a list that
drives the optional repeat loop below, then
--------------- start copy here
--Short, fast, sweet.
on test
put fld 1 into htmlStr --assumes source is in fld 1
replace Null with empty in htmlStr --clean out
--numtochar(3) works just as well
replace cr with "†" in htmlStr --preserve
--repeat for each tag you want to preserve
replace "<img" with cr&"img" in htmlStr
set the itemDel to ">"
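repeat for each line LNN in (line 2 to -1 of htmlStr)
put item 1 of LNN into tagStr
put tagStr & null after newHtmlStr
--keep everything after the tag's ">", not just item 2
put char (length(tagStr) + 2) to -1 of LNN & cr after newHtmlStr
end repeat
delete last char of newHtmlStr --drop the trailing cr so no stray "<" is restored
put newHtmlstr into line 2 to -1 of htmlStr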
--end repeat for each tag you want to preserve
-------------
-- Now every line from 2 on starts with "img" and has a null char in place of the tag's closing ">"
-- Strip all the remaining <tags> at this point, then restore the protected ones
-------
--Now restore all the protected tags
replace cr with "<" in htmlStr
replace null with ">" in htmlStr
replace "†" with cr in htmlStr
put htmlStr into fld 2
end test
-- although this last replace step may not make any sense since HTML pages
-- don't use cr to format or delimit anything.
This is very fast and protects/restores the targeted tags.
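
The script above protects and restores the tags but leaves the actual stripping step as a comment. A minimal sketch of the whole protect / strip / restore idea as a function might look like the following. The name stripAllTagsBut and its parameters come from Chipp's request; everything else (the variable names, the use of numToChar(1) and numToChar(2) as markers instead of cr and null, and the replaceText pattern) is an assumption for illustration, not tested against real-world pages.

```livecode
function stripAllTagsBut pHtml, pTagsList
   local tOpenMark, tCloseMark, tTag, tChunk, tPiece, tPos, tOut
   -- Assumption: control chars 1 and 2 never occur in real HTML,
   -- so they can stand in for the "<" and ">" of the tags we keep
   put numToChar(1) into tOpenMark
   put numToChar(2) into tCloseMark
   replace tOpenMark with empty in pHtml
   replace tCloseMark with empty in pHtml

   -- 1. Protect the "<" of every tag listed in pTagsList (one tag per line).
   -- Matching on tag & space or tag & ">" keeps "b" from also catching "br".
   -- replace is case-insensitive by default, so kept tag names end up
   -- in the case used in pTagsList.
   repeat for each line tTag in pTagsList
      if tTag is empty then next repeat
      replace "<" & tTag & space with tOpenMark & tTag & space in pHtml
      replace "<" & tTag & ">" with tOpenMark & tTag & ">" in pHtml
      replace "</" & tTag & ">" with tOpenMark & "/" & tTag & ">" in pHtml
   end repeat

   -- 2. Protect the first ">" after each protected "<"
   set the itemDelimiter to tOpenMark
   put item 1 of pHtml into tOut
   repeat for each item tChunk in (item 2 to -1 of pHtml)
      put tChunk into tPiece
      put offset(">", tPiece) into tPos
      if tPos > 0 then put tCloseMark into char tPos of tPiece
      put tOpenMark & tPiece after tOut
   end repeat

   -- 3. Strip every tag that is still written with real angle brackets
   put replaceText(tOut, "<[^>]*>", empty) into tOut

   -- 4. Restore the protected tags
   replace tOpenMark with "<" in tOut
   replace tCloseMark with ">" in tOut
   return tOut
end stripAllTagsBut
```

Called as stripAllTagsBut("<p>Hi <b>there</b><img src='a.png'></p>", "img"), this should leave Hi there<img src='a.png'>. Tags written as "<br/>" (no space before the slash), tags split across lines, and attribute values containing ">" are not handled by this sketch.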
Jim Ault
Las Vegas