perl regex modifiers

Alex Rice alrice at ARCplanning.com
Sat Jul 26 12:28:00 EDT 2003


On Saturday, July 26, 2003, at 08:58  AM, Mark Brownell wrote:

> Yes I can see that. It will also not work for MTML because a part of 
> one element tag set can begin inside of another element tag set and 
> end outside of it in MTML. This was a primary issue while defending 
> off XML innovators several years ago when I started experimenting with 
> it. The PNLP handler is not affected by this problem, and at this 
> point it is still the fastest choice. Where I see an advantage is in 
> some of MTML's multimedia handling tag sets that could be easier to 
> script with perlRegEx.

Mark, I think you are right to look for pros and cons in each method. 
However, don't sell the regex method short. I don't think there is a 
problem with it, or anything that it's not capable of doing, with the 
right pattern. You are judging regex based on only a few examples of a 
pattern match. It's certainly possible to handle nested tags and even 
overlapping tags: it's just a matter of crafting the right regular 
expression.
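
To make that concrete, here is a rough sketch in Python (not RR syntax, 
and not MTML's real tag set, which I'm only guessing at). It assumes 
plain tags with no attributes; the names TAG and tag_spans are made up 
for illustration. One pattern finds every opening or closing tag, and a 
small per-name stack pairs them up, so a tag set that starts inside 
another and ends outside it is still reported:

```python
import re

# One pattern that matches either an opening tag <name> or a closing
# tag </name>. Attributes are deliberately not handled in this sketch.
TAG = re.compile(r"<(/?)([A-Za-z][\w-]*)>")

def tag_spans(text):
    """Return (name, start, end) offsets of the content of each tag pair.

    Each tag name gets its own stack, so pairs of different names may
    nest or overlap freely.
    """
    open_at = {}   # tag name -> stack of offsets where content starts
    spans = []
    for m in TAG.finditer(text):
        closing, name = m.group(1), m.group(2)
        if not closing:
            open_at.setdefault(name, []).append(m.end())
        elif open_at.get(name):
            start = open_at[name].pop()
            spans.append((name, start, m.start()))
    return spans

# <b> opens inside <a> but closes outside it.
sample = "<a>one <b>two</a> three</b>"
for name, start, end in tag_spans(sample):
    print(name, repr(sample[start:end]))
```

The regex only does the tokenizing here; the pairing is a few lines of 
ordinary script around it, which is how I'd expect it to look in most 
languages.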

Tuviah said the patterns are cached, so speed is not going to be an 
issue for repeated calls. That is, your regex pattern itself may be 
fast or slow, but calling the regex function in a loop will be fast 
because RR will cache the compiled regex pattern.
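
For comparison, this is the same idea spelled out by hand in Python. I 
don't know the details of how RR's cache works, so take this only as an 
illustration of why a cached pattern makes the loop cheap; the pattern 
and the name image_sources are invented for the example:

```python
import re

# Compile once, outside the loop, so every call reuses the same
# compiled pattern instead of re-parsing the pattern string.
SRC_ATTR = re.compile(r'<img\s+src="([^"]*)"')

def image_sources(lines):
    """Collect the src value of the first <img> tag on each line."""
    found = []
    for line in lines:
        m = SRC_ATTR.search(line)
        if m:
            found.append(m.group(1))
    return found

print(image_sources(['<img src="a.png">', 'no tag here']))
```

Python's re module also keeps its own cache of recently compiled 
patterns, so even passing the pattern string on each call is usually 
not a big cost there.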

Lots of people have written XML parsers using regular expressions. I 
don't know if it's been done with RR, but it certainly has been for 
Perl, Python and other scripting languages with regex features.

There are lots of Perl modules here; maybe you can get some ideas:
http://search.cpan.org/modlist/String_Language_Text_Processing/XML

Some of those listed will be just wrappers around C libraries like 
Expat or Xalan, and some will be written in pure Perl with regular 
expressions. In particular I think XML::Grove and XML::Parser::Lite use 
regex to do their parsing. You could copy their Perl regular expression 
syntax for use in your project. You will find some SAX-like, DOM-like 
and probably some pull-parser-like stuff in that list.
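
As a rough idea of what the SAX-like flavor looks like when it's built 
on a regex, here is a tiny Python sketch. It is not taken from any of 
those modules; TOKEN and events are hypothetical names, and it ignores 
attributes, comments and entities:

```python
import re

# Either a tag (<name> or </name>) or a run of text up to the next "<".
TOKEN = re.compile(r"<(/?)([A-Za-z][\w-]*)>|([^<]+)")

def events(text):
    """Yield ("start", name), ("end", name) and ("text", chars) events."""
    for m in TOKEN.finditer(text):
        closing, name, chars = m.groups()
        if chars is not None:
            yield ("text", chars)
        elif closing:
            yield ("end", name)
        else:
            yield ("start", name)

for event in events("<p>Hi <b>there</b></p>"):
    print(event)
```

The caller decides what to do with each event, so the same tokenizer 
can feed a tree builder or a simple handler per tag.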

I say this partially because I don't really understand how the offset 
method would be used to parse xml in a general, reusable way :-)

Hope this helps,

Alex Rice, Software Developer
Architectural Research Consultants, Inc.
http://ARCplanning.com



