Stripping html tags

Hershel Fisch hershf at rgllc.us
Mon Nov 12 19:12:27 EST 2007


On 11/3/07 3:48 AM, "FlexibleLearning at aol.com" <FlexibleLearning at aol.com>
 wrote:

Hi all, I just noticed this thread and thought that if my 2 cents could
help, I'd share them with you.
This is what I use to strip HTML:


function fStripHtml tParam
  replace "&nbsp;" with space in tParam
  replace "<p" with return & "<p " in tParam
  replace "<tr>" with return & "<tr>" in tParam
  replace "&hellip;" with "," in tParam
  replace "&#39;" with "'" in tParam
  replace "&quot;" with quote in tParam
  replace "&rsquo;" with "'" in tParam
  replace "&ldquo;" with quote in tParam
  replace "&rdquo;" with quote in tParam
  replace "&amp;" with "&" in tParam
  replace "<br>" with return in tParam
  replace tab with "" in tParam
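  put 0 into d -- 1 while inside a tag, 0 otherwise
  repeat for each line tL in tParam
    put tL into tNl
    put 1 into x
    repeat for (the number of chars in tNl) times
      if char x of tNl = ">" then
        -- closing bracket: delete it and leave the tag
        put "" into char x of tNl
        put 0 into d
      end if
      if char x of tNl = "<" then
        put 1 into d
      end if
      if d = 1 then
        -- inside a tag: delete the char, x stays where it is
        put "" into char x of tNl
      else
        add 1 to x
      end if
    end repeat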
    --replace space & space & space & space with space in tNl
    replace space & space & space with space in tNl
    --replace space & space with space in tNl
    
    if char 1 in tNl = " " then put "" into char 1 in tNl
    if char -1 in tNl = " " then put "" into char -1 in tNl
    if tNl <> "" then put tNl & return after tNf
    
  end repeat
  return tNf
end fStripHtml
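
For anyone who wants to try it, here is a minimal usage sketch. The
button handler, the sample HTML string and the field name "Result" are
made up for illustration and are not part of the function above:

on mouseUp
  -- hypothetical sample input; any HTML text will do
  put "<p>Hello <b>world</b></p><br>Goodbye" into tHtml
  -- field "Result" is an assumed field on the current card
  put fStripHtml(tHtml) into field "Result"
end mouseUp

The field should end up holding the text with the tags removed, with
empty lines dropped.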

Hope it helps.
Hershel

> 
> This is a seriously detailed stripper, Jim!
> 
> Small error in syntax:
> 
> replace "<td" with numtochar(160)&"<td" in pHtml
> should be...
> replace "<td" with numtochar(160)&"td>" in pHtml
> 
> Also, a couple of lines were posted html2Txt-mangled. Could you clarify:
> -----
> replace " " with space in pHtml
> replace "
> " with return in pHtml
> replace "
> 
> " with return in pHtml
> -----
> 
> If you post the handler as plain text, any html formatted text should be
> correctly handled by the emailer.
> 
> 
> /H
> 
> -------------------------------
> -------------------------------------------------
> function StripTags pHtml
> local tRegex,tPrevText
> get ("&eacute;,&agrave;,&ccedil;")
> get it & (",&gt;,&lt;,&ecirc;")
> get it & (",&egrave;,&copy;,&bull;")
> get it & (",&#39;,&middot;,&amp;")
> -- add more chars if you wish, then...
> constant kHtml = it
> constant kConvertedHtml = "é,à,ç,>,<,ê,è,©"
> --using constants means you cannot accidentally
> --    modify these vars and damage the results
> -----
> replace numtochar(13) with empty in pHtml
> replace tab with empty in pHtml
> replace "<td" with numtochar(160)&"<td" in pHtml
> -----
> put replacetext(pHtml,"(?Usi)<SCRIPT.*</SCRIPT>","") into pHtml
> put replacetext(pHtml,"(?Usi)<STYLE>.*</STYLE>","") into pHtml
> put replacetext(pHtml,"(?Usi)<\?.*\?>","") into pHtml
> -----
> replace " " with space in  pHtml
> replace "
> " with return in pHtml
> replace "
> 
> " with return in pHtml
> -----
> put "<[^><]*>" into tRegex
> put replacetext(pHtml,tRegex,"") into pHtml
> put replacetext(pHtml,tRegex,"") into pHtml
> 
> ----- repeat replacements until there are no changes
> repeat until tPrevText is pHtml
> put pHtml into tPrevText
> put replacetext(pHtml," +",space) into pHtml
> put replacetext(pHtml,"^ ","") into pHtml
> end repeat
> -----
> replace (space & return) with return in pHtml
> replace (return & space) with return in pHtml
> filter pHtml without empty
> replace numtochar(160) with empty in pHtml
> -----
> replace "&quot;" with quote in pHtml
> repeat with i = 1 to the number of items of kHtml
> replace item i of kHtml with item i of kConvertedHtml in pHtml
> end repeat
> -----
> --put pHtml into msg --lets you see the result in the msg box
> return pHtml
> end StripTags
> 
> 
> Jim Ault
> Las Vegas
> 
> ------------------------------------------------
> --------------------------------
> 
> 
> 
>  



