Anyone got one of these?

Jim Ault JimAultWins at yahoo.com
Fri Jan 26 16:20:39 EST 2007


On 1/26/07 10:56 AM, "Chipp Walters" <chipp at chipp.com> wrote:

> function stripAllTagsBut pHtml,pTagsList
>   --> pTagsList IS A LIST OF TAGS NOT TO EXCLUDE FROM PARSING
>   --> EX. LINE 1 OF pTagsList CAN BE "img" AND LINE 2 CAN BE "b", etc..
> It's used to strip all tags from HTML but those in the pTagsList parameter.
> 
> IOW, it can be used to grab the HTML of a page, and strip everything but the
> img tags.
> 
Do you need them in sequential order?
Assuming yes.

I would start by making the 'un-naughty bits' (the tags to keep) a list that
drives the optional repeat loop below, then protect each tag in turn like this:

---------------  start copy here
--Short, fast, sweet.
on test
  put fld 1 into htmlStr --assumes source is in fld 1
  replace null with empty in htmlStr --clean out; null is used as a marker below
  --numToChar(3) works just as well as the marker
  replace cr with "†" in htmlStr --preserve line breaks; cr is a marker too

  --repeat for each tag you want to preserve
  replace "<img" with cr&"img" in htmlStr
  
  set the itemDel to ">"
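  put empty into newHtmlStr
  repeat for each line LNN in (line 2 to -1 of htmlStr)
    --item 1 is the tag body; everything after the first ">" is kept as is
    put item 1 of LNN & null & item 2 to -1 of LNN & cr after newHtmlStr
  end repeat
  if newHtmlStr is not empty then
    delete char -1 of newHtmlStr --no trailing cr, or restore adds a stray "<"
    put newHtmlStr into line 2 to -1 of htmlStr
  end if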
  --end repeat for each tag you want to preserve
  -------------
  --  Now line 2 to last start with "img" and have a null char in place of
  --  the closing ">".
  --  At this point you can strip all remaining <tags>, then restore the
  --  protected ones as follows.
  -------
--Now restore all the protected tags
  replace cr with "<" in htmlStr
  replace null with ">" in htmlStr
  replace "†" with cr in htmlStr
  put htmlStr into fld 2
end test

Note that the last replace step (putting the cr back) may not be needed, since
HTML pages don't use cr to format or delimit anything.

This is very fast and protects/restores the targeted tags.
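
For comparison, here is a rough sketch of the whole stripAllTagsBut function
Chipp described. It is not the protect/restore trick above: it splits the
page on "<" in a single pass and keeps a tag only when its name is a line of
pTagsList. The variable names are made up for illustration, and it assumes
simple HTML (a ">" inside an attribute value or an HTML comment will confuse
it), so treat it as a starting point rather than a finished parser.

```livecode
function stripAllTagsBut pHtml, pTagsList
   -- Sketch only: one pass over the pieces between "<" characters.
   local tResult, tPiece, tPos, tTagText, tName, tFirst
   set the itemDelimiter to "<"
   put true into tFirst
   repeat for each item tPiece in pHtml
      if tFirst then
         -- text before the first "<" is never part of a tag
         put tPiece after tResult
         put false into tFirst
         next repeat
      end if
      put offset(">", tPiece) into tPos
      if tPos = 0 then
         -- no closing ">": treat the piece as literal text
         put "<" & tPiece after tResult
         next repeat
      end if
      put char 1 to (tPos - 1) of tPiece into tTagText
      -- tag name is the first word, without the "/" of </b> or <br/>
      put word 1 of tTagText into tName
      if char 1 of tName is "/" then delete char 1 of tName
      if char -1 of tName is "/" then delete char -1 of tName
      if tName is not empty and tName is among the lines of pTagsList then
         put "<" & tTagText & ">" after tResult -- a tag we were asked to keep
      end if
      -- the text after the tag is always kept
      put char (tPos + 1) to -1 of tPiece after tResult
   end repeat
   return tResult
end stripAllTagsBut
```

A call such as put stripAllTagsBut(fld 1, "img" & cr & "b") into fld 2 should
keep the img and b tags (opening and closing) and drop the rest. Because the
tag name is compared as a whole word, "b" does not also match "br" or "body",
which a plain replace "<b" would.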

Jim Ault
Las Vegas




