remove html tags from text

Cubist at aol.com
Fri Sep 8 20:09:59 EDT 2006


In a message dated 9/8/06 11:40:31 AM, Mark Wieder <mwieder at ahsoftware.net> 
writes:
>Friday, September 8, 2006, 6:10:53 AM, you wrote:
>> barely tested, but maybe a starting point:
>> function striptags tHtml
>>    replace cr with empty in tHtml -- in case of multi-line tags
>>    replace "<" with cr & "<" in tHtml
>>    replace ">" with ">" & cr in tHtml
>>    filter tHtml without "*<*"
>>    filter tHtml without "*>*"
>>    return tHtml
>> end striptags
>Clever... but it'll fail on
>
>if xyz > 4096 then
   No, it won't; not if you're working with an honest-to-God HTML document, 
at least. In such a document, literal greater-than and less-than signs occur 
*only* as tag delimiters *in the HTML source*; if you want either of those 
symbols to show up when someone views your page in a browser window, it has 
to be written as an HTML entity (&lt; or &gt;), which starts with an 
ampersand and ends with a semicolon.
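
   As a rough illustration only (a sketch, not something that has been run 
against real pages): the script below repeats the striptags function quoted 
above and feeds it a line whose greater-than sign is written as the entity 
&gt;. The handler name testStripTags and the sample string are made up for 
this example.

```livecode
-- striptags exactly as quoted earlier in this message
function striptags tHtml
   replace cr with empty in tHtml -- in case of multi-line tags
   replace "<" with cr & "<" in tHtml
   replace ">" with ">" & cr in tHtml
   filter tHtml without "*<*"
   filter tHtml without "*>*"
   return tHtml
end striptags

-- hypothetical test handler: name and sample text are assumptions
on testStripTags
   local tSample
   -- the comparison operator is entity-encoded, as it would be in real HTML
   put "<p>if xyz &gt; 4096 then</p>" into tSample
   -- with no destination, put shows the result in the message box
   put striptags(tSample)
end testStripTags
```

   The <p> and </p> tags should each end up on a line of their own and get 
filtered out, while the text line survives because it contains no literal 
angle bracket (possibly with some empty lines around it). The entity itself 
is left undecoded, so &gt; stays as &gt; in the result. If the sample had a 
raw ">" in place of &gt;, the line would be split at that character and the 
first half filtered away.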

>maybe replace the two filter lines with
>
>   filter tHtml without "<*>"
   I don't think there's any need to go that route. Under what circumstances 
will you ever encounter a document which includes angle-bracketed HTML tags 
*and* leaves honest-to-God angle brackets in their natural, un-Entity-ized state?

-- 
ANTHRO -- http://anthrozine.com
"It's furry. It's the *good* stuff."



More information about the use-livecode mailing list