Grep help - remove HTML tags

Ben Rubinstein benr_mc at cogapp.com
Wed Mar 19 08:02:35 EST 2003


on 14/3/03 5:13 pm, Alex Rice wrote

> I agree with the other poster, a real XML parser would be a better way
> to go. What if the HTML is mixed case <TitLE>something</tiTLE>?

Depending on the source of the content - I doubt it!  If the HTML being
parsed comes 'from the wild', there's a 99% chance that it's irregular - not
well-formed HTML, let alone XHTML or XML.

And I wouldn't have thought that mixed case would be an issue with
matchText - IIRC, it's case insensitive unless specifically asked not to be.
Whereas XML on the other hand is defined by spec to be case-sensitive; so a
'proper' parser, if behaving 'correctly', should choke on the above example
- whereas Keith would probably rather it accepted it.
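
To make the contrast concrete, here is a rough sketch in Python rather than
Transcript, so it says nothing about how matchText itself behaves. It only
assumes Python's standard re and xml.etree.ElementTree modules; the function
names (strip_tags, title_text, parses_as_xml) are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The mixed-case example from the quoted message.
SAMPLE = "<TitLE>something</tiTLE>"

def strip_tags(html):
    """Remove anything that looks like a tag, whatever its case."""
    return re.sub(r"<[^>]+>", "", html)

def title_text(html):
    """Pull out the title contents with a case-insensitive pattern."""
    m = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return m.group(1) if m else None

def parses_as_xml(text):
    """Report whether a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Start and end tag names differ in case, so they do not match.
        return False

print(strip_tags(SAMPLE))
print(title_text(SAMPLE))
print(parses_as_xml(SAMPLE))
```

The two regex helpers accept the sloppy markup, while the XML parser rejects
it because the start and end tag names differ in case.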
 
  Ben Rubinstein               |  Email: benr_mc at cogapp.com
  Cognitive Applications Ltd   |  Phone: +44 (0)1273-821600
  http://www.cogapp.com        |  Fax  : +44 (0)1273-728866