RegEx Help--Across Lines

Ken Ray kray at sonsothunder.com
Sun Nov 21 01:43:45 EST 2004


On 11/20/04 8:20 PM, "Sivakatirswami" <katir at hindu.org> wrote:

> I am using Rev to repurpose old HTML to new CSS-compliant markup. The
> old pages are incredibly inconsistent. Fortunately grep is our
> friend. I need a grep expression that will parse out the content from
> 
> both #1:
> 
> <title> some title </title>
> 
> and #2
> 
> <title> some title
> </title>
> 
> where the first instance has no line break but the second one does

Use the "(?s)" directive:

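on mouseUp
  local tTitle, tXML
  -- sample input: the closing tag sits on its own line
  put "<title>some title" & cr & "</title>" into tXML
  -- (?s) lets "." match line breaks; (.*?) captures up to the first "</title"
  get matchText(tXML, "(?s)title>(.*?)</title", tTitle)
  put tTitle
end mouseUp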

Note that you'll get the trailing CR after "some title" as well, so you'd
have to strip that out if you want to.
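
If you would rather not strip the CR afterwards, one option is to let the pattern absorb the surrounding whitespace itself. This is only a sketch, assuming the title text has no leading or trailing whitespace you want to keep; the "\s*" on each side of the capture group is my addition, not part of the expression above:

```livecode
on mouseUp
  local tTitle, tXML
  put "<title> some title" & cr & "</title>" into tXML
  -- \s* outside the parentheses swallows spaces and line breaks,
  -- so the captured text carries no leading space or trailing CR
  if matchText(tXML, "(?s)<title>\s*(.*?)\s*</title>", tTitle) then
    put tTitle
  end if
end mouseUp
```

The same pattern should handle both of your cases (#1 and #2), since "\s*" also matches when there is no whitespace at all. Trimming after the fact with "put word 1 to -1 of tTitle into tTitle" is another common way to do it.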

Check the docs at http://www.pcre.org/man.txt - the "(?s)" directive
corresponds to PCRE_DOTALL, which causes the "." character to match all
characters, including newlines (CRs).
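
To see the effect of the directive, here is a small comparison that runs the same pattern with and without "(?s)" on the two-line input. The variable names tPlain and tDotAll are just illustrative:

```livecode
on mouseUp
  local tTitle, tXML, tPlain, tDotAll
  put "<title>some title" & cr & "</title>" into tXML
  -- without (?s), "." stops at the line break, so the match fails
  put matchText(tXML, "title>(.*?)</title", tTitle) into tPlain
  -- with (?s), "." also matches the line break, so the match succeeds
  put matchText(tXML, "(?s)title>(.*?)</title", tTitle) into tDotAll
  put "plain:" && tPlain & cr & "dotall:" && tDotAll
end mouseUp
```

On the sample input, the first call should report false and the second true.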

HTH,

Ken Ray
Sons of Thunder Software
Web site: http://www.sonsothunder.com/
Email: kray at sonsothunder.com




More information about the use-livecode mailing list