perl regex modifiers
Alex Rice
alrice at ARCplanning.com
Sat Jul 26 12:28:00 EDT 2003
On Saturday, July 26, 2003, at 08:58 AM, Mark Brownell wrote:
> Yes I can see that. It will also not work for MTML because a part of
> one element tag set can begin inside of another element tag set and
> end outside of it in MTML. This was a primary issue while defending
> off XML innovators several years ago when I started experimenting with
> it. The PNLP handler is not affected by this problem, and at this
> point it is still the fastest choice. Where I see an advantage is in
> some of MTML's multimedia handling tag sets that could be easier to
> script with perlRegEx.
Mark, I think you are right to look for pros and cons in each method,
but don't sell the regex method short. I don't think there is a problem
with it or anything that it's not capable of doing, with the right
pattern. You are judging regex based on only a few examples of a
pattern match. It's certainly possible to handle nested tags and even
overlapping tags: it's just a matter of crafting the right regular
expression.
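
To make the overlapping-tag point concrete, here is a rough sketch in Python rather than RR or Perl. It is only an illustration under my own assumptions: the tag syntax (bare <name> and </name>, no attributes), the TOKEN pattern, and the tag_spans function are all hypothetical, not MTML or anything from RR. The regex only finds the tags; a small table of open offsets (instead of a strict stack) is what lets one tag set begin inside another and end outside it.

```python
import re

# Hypothetical tag syntax: <name> opens, </name> closes, no attributes.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def tag_spans(text):
    """Return (name, enclosed_text) pairs; tag sets may nest or overlap."""
    plain = []      # pieces of the text with the tags stripped out
    open_at = {}    # tag name -> offsets (into the stripped text) of open tags
    spans = []      # (name, start, end) offsets into the stripped text
    pos = 0         # where the previous tag ended in the raw text
    length = 0      # length of the stripped text collected so far
    for m in TOKEN.finditer(text):
        chunk = text[pos:m.start()]
        plain.append(chunk)
        length += len(chunk)
        pos = m.end()
        closing, name = m.group(1), m.group(2)
        if not closing:
            open_at.setdefault(name, []).append(length)
        elif open_at.get(name):
            # Pair the close with the most recent open of the same name,
            # regardless of what other tags were opened in between.
            spans.append((name, open_at[name].pop(), length))
    plain.append(text[pos:])
    stripped = "".join(plain)
    return [(name, stripped[start:end]) for name, start, end in spans]
```

Calling tag_spans("<a>one <b>two</a> three</b>") pairs each close with its own open even though the two tag sets overlap. Unmatched closing tags are simply ignored in this sketch.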
Tuviah said the patterns are cached, so speed should not be an issue.
That is, your regex pattern itself may be fast or slow, but calling the
regex function in a loop will be fast because RR will cache the
compiled regex pattern.
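
The same idea in Python, as a sketch only: I don't know how RR implements its cache, so this just shows the general technique of compiling a pattern once and reusing it inside a loop. The TAG pattern and the count_tags function are made up for the illustration.

```python
import re

# Hypothetical pattern: a simple opening tag such as <b> or <img>.
# Compiled once here, then reused on every call below.
TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)>")

def count_tags(lines):
    """Count opening tags across all lines using the precompiled pattern."""
    return sum(len(TAG.findall(line)) for line in lines)
```

The cost of compiling the pattern is paid once; what remains per call is the matching itself, which depends on how the pattern is written.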
Lots of people have written XML parsers using regular
expressions. I don't know if it's been done with RR, but certainly it
has for Perl, Python and other scripting languages with regex features.
There are lots of Perl modules here; maybe you can get some ideas:
http://search.cpan.org/modlist/String_Language_Text_Processing/XML
Some of those listed will be just wrappers around C libraries like
Expat or Xalan, and some will be written in pure Perl with regular
expressions. In particular I think XML::Grove and XML::Parser::Lite use
regex to do their parsing. You could copy their Perl regular expression
syntax for use in your project. You will find some SAX-like, DOM-like
and probably some pull-parser-like stuff in that list.
I say this partially because I don't really understand how the offset
method would be used to parse XML in a general, reusable way :-)
Hope this helps,
Alex Rice, Software Developer
Architectural Research Consultants, Inc.
http://ARCplanning.com