Stripping html tags

Hershel Fisch hershf at rgllc.us
Mon Nov 12 19:12:27 EST 2007


On 11/3/07 3:48 AM, "FlexibleLearning at aol.com" <FlexibleLearning at aol.com>
 wrote:

Hi all, I just noticed this thread and thought that if my 2 cents could
help, I'd share it with you.
This is what I use to strip HTML:


function fStripHtml tParam
  replace "&nbsp;" with space in tParam
  replace "<p" with return & "<p " in tParam
  replace "<tr>" with return & "<tr>" in tParam
  replace "&hellip;" with "..." in tParam
  replace "&#39;" with "'" in tParam
  replace "&quot;" with quote in tParam
  replace "&rsquo;" with "'" in tParam
  replace "&ldquo;" with quote in tParam
  replace "&rdquo;" with quote in tParam
  replace "&amp;" with "&" in tParam
  replace "<br>" with return in tParam
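  replace tab with "" in tParam
  put 0 into d -- 1 while inside a tag; persists across lines for multi-line tags
  put empty into tNf -- accumulates the stripped output
  repeat for each line tL in tParam
    put tL into tNl
    put 1 into x
    repeat for (the number of chars in tNl) times
      -- a ">" closes the current tag: drop it and resume keeping text
      if char x of tNl = ">" then
        put "" into char x of tNl
        put 0 into d
      end if
      -- a "<" opens a tag: start dropping chars
      if char x of tNl = "<" then
        put 1 into d
      end if
      if d = 1 then
        put "" into char x of tNl
      else
        add 1 to x
      end if
    end repeat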
    --replace space & space & space & space with space in tNl
    replace space & space & space with space in tNl
    --replace space & space with space in tNl
    
    if char 1 in tNl = " " then put "" into char 1 in tNl
    if char -1 in tNl = " " then put "" into char -1 in tNl
    if tNl <> "" then put tNl & return after tNf
    
  end repeat
  return tNf
end fStripHtml

Hope it helps.
Hershel

> 
> This is a seriously detailed stripper, Jim!
> 
> Small error in syntax:
> 
> replace "<td" with numtochar(160)&"<td" in pHtml
> should be...
> replace "<td" with numtochar(160)&"td>" in pHtml
> 
> Also, a couple of lines were posted html2Txt-mangled. Could you clarify:
> -----
> replace "&nbsp;" with space in pHtml
> replace "
> " with return in pHtml
> replace "
> 
> " with return in pHtml
> -----
> 
> If you post the handler as plain text, any html formatted text should be
> correctly handled by the emailer.
> 
> 
> /H
> 
> --------------------------------------------------------------------------------
> function StripTags pHtml
> local tRegex,tPrevText
> get ("&eacute;,&agrave;,&ccedil;")
> get it & (",&gt;,&lt;,&ecirc;")
> get it & (",&egrave;,&copy;,&#149;")
> get it & (",&#39;,&middot;,&amp;")
> -- add more chars if you wish, then...
> constant kHtml = it
> constant kConvertedHtml = "é,à,ç,>,<,ê,è,©"
> --using constants means you cannot accidentally
> -- modify these vars and damage the results
> -----
> replace numtochar(13) with empty in pHtml
> replace tab with empty in pHtml
> replace "<td" with numtochar(160)&"<td" in pHtml
> -----
> put replacetext(pHtml,"(?Usi)<SCRIPT.*</SCRIPT>","") into pHtml
> put replacetext(pHtml,"(?Usi)<STYLE>.*</STYLE>","") into pHtml
> put replacetext(pHtml,"(?Usi)<\?.*\?>","") into pHtml
> -----
> replace "&nbsp;" with space in pHtml
> replace "
> " with return in pHtml
> replace "
> 
> " with return in pHtml
> -----
> put "<[^><]*>" into tRegex
> put replacetext(pHtml,tRegex,"") into pHtml
> put replacetext(pHtml,tRegex,"") into pHtml
> 
> ----- repeat replacements until there are no changes
> repeat until tPrevText is pHtml
> put pHtml into tPrevText
> put replacetext(pHtml," +",space) into pHtml
> put replacetext(pHtml,"^ ","") into pHtml
> end repeat
> -----
> replace (space & return) with return in pHtml
> replace (return & space) with return in pHtml
> filter pHtml without empty
> replace numtochar(160) with empty in pHtml
> -----
> replace "&quot;" with quote in pHtml
> repeat with i = 1 to the number of items of kHtml
> replace item i of kHtml with item i of kConvertedHtml in pHtml
> end repeat
> -----
> --put pHtml into msg --lets you see the result in the msg box
> return pHtml
> end StripTags
> 
> 
> Jim Ault
> Las Vegas
> 
> --------------------------------------------------------------------------------


