Regex to remove all tags from a web page

Eric Chatonet eric.chatonet at
Mon Oct 31 06:19:43 EST 2005

Hi all,

I searched the list archive and the net for a regex that would
retrieve the meaningful text from any web page by stripping all HTML
tags, extra code, etc., but I did not find anything really
convincing :-(
Any help would be much appreciated :-)

PS. I don't want to use the "set the htmlText/get text" approach with a
field: it crashes Rev unpredictably during batch processing.
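
For illustration, the kind of regex-only stripping described here could be sketched as below. This is a rough sketch in Python rather than Rev, and the function name strip_tags and the pattern names are hypothetical. It assumes that script and style blocks and HTML comments count as "extra code", and that each tag can safely be replaced by a space. Regexes are not a full HTML parser, so badly malformed markup may still slip through.

```python
import html
import re

# Elements whose content is never visible text (assumption: only these two).
_INVISIBLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
# HTML comments, which may span several lines.
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Any remaining tag, opening or closing.
_TAG = re.compile(r"<[^>]+>")
# Runs of whitespace left behind by the removals.
_WHITESPACE = re.compile(r"\s+")


def strip_tags(page: str) -> str:
    """Return an approximation of the visible text of an HTML page."""
    text = _INVISIBLE.sub(" ", page)   # drop script/style with their content
    text = _COMMENT.sub(" ", text)     # drop comments
    text = _TAG.sub(" ", text)         # drop the tags, keep what is between them
    text = html.unescape(text)         # turn entities such as &amp; into characters
    return _WHITESPACE.sub(" ", text).strip()


sample = (
    "<html><head><style>p{color:red}</style>"
    "<script>var a = 1 < 2;</script></head>"
    "<body><!-- note --><p>Hello &amp; <b>welcome</b></p></body></html>"
)
print(strip_tags(sample))  # prints: Hello & welcome
```

The order of the passes matters: script and style blocks go first because their content can contain "<" characters that would confuse the generic tag pattern. Replacing tags with a space keeps adjacent blocks from running together, at the cost of splitting a word that has inline markup in the middle of it.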

Best Regards from Paris,

Eric Chatonet.
So Smart Software

For institutions, companies and associations
Built-to-order applications: management, multimedia, internet, etc.
Windows, Mac OS and Linux... With the French touch

Free plugins and tutorials on my website
Web site
Email        eric.chatonet at
Phone        33 (0)1 43 31 77 62
Mobile        33 (0)6 20 74 50 86
