JimAultWins at yahoo.com
Thu Dec 1 17:48:17 EST 2005
Mark, try an uppercase U, like the following:
(?U) ungreedy (shortest possible match), case-sensitive, '.' stops at the end of the line.
(?Usi) ungreedy, '.' also matches line endings, ignore case.
(?Ui) or (?iU) ungreedy, '.' stops at the end of the line, ignore case.
Your example should be:
> matchChunk(tRawText, "(?Usi)<B>(.+)</B>", tPos, tEnd)
Use the 'si' part for the times when the HTML is published with the <B> and
</B> on separate lines and the tags are not always in upper case.
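As a rough sketch of how that call might sit in a script (the mouseUp handler and the local variable declarations are only assumptions for illustration; the string and the pattern are the ones from this thread):

```livecode
on mouseUp
   local tRawText, tPos, tEnd
   put "<B>hello, bucko</B> this is a <B>test</B>" into tRawText
   -- (?U) makes the quantifiers ungreedy, so (.+) stops at the first </B>
   if matchChunk(tRawText, "(?Usi)<B>(.+)</B>", tPos, tEnd) then
      -- tPos and tEnd are the start and end character offsets of the (.+) group
      put char tPos to tEnd of tRawText
   else
      put "no match"
   end if
end mouseUp
```

With the U in place, this should put "hello, bucko" into the message box rather than everything up to the last </B>.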
Without the U, (?si) stays greedy, which is useful when you want to grab the
most text and ignore case, such as
(?si)<table>(.*)</TABLE> == everything from the first <table> to the last
</TABLE>, in order to get all the nested table data
(?si)chapter 1(.*)chapter 2 == everything between the headings, whether or not
'chapter' is proper/upper/lower case.
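To see the two modes side by side, here is a small sketch that runs the table pattern greedy and ungreedy over the same text. The sample HTML, the handler, and the variable names are made up for illustration.

```livecode
on mouseUp
   local tHtml, tStart, tEnd, tGreedy, tUngreedy
   -- hypothetical sample: two tables, closing tags in mixed case
   put "<table><tr><td>a</td></tr></TABLE> text <table><tr><td>b</td></tr></table>" into tHtml
   -- greedy: (.*) runs from the first <table> to the last closing tag
   if matchChunk(tHtml, "(?si)<table>(.*)</table>", tStart, tEnd) then
      put char tStart to tEnd of tHtml into tGreedy
   end if
   -- ungreedy: adding U makes (.*) stop at the first closing tag
   if matchChunk(tHtml, "(?Usi)<table>(.*)</table>", tStart, tEnd) then
      put char tStart to tEnd of tHtml into tUngreedy
   end if
   -- show both captures, one per line, in the message box
   put tGreedy & return & tUngreedy
end mouseUp
```

The greedy capture should span both tables and the text between them, while the ungreedy one should hold only the rows of the first table.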
On 12/1/05 1:37 PM, "Mark Wieder" <mwieder at ahsoftware.net> wrote:
> Is there a way to tell matchChunk() not to be greedy?
> I've got a regular expression stumper here. I'm using regex with the
> matchChunk() function and I've BZed a problem with it. Now I'm looking
> for a workaround that still manages to use matchChunk().
> MatchChunk() is apparently implementing regex's "greedy" mode, so a
> call to matchChunk() returns everything between the first <B> tag and
> the *last* </B> tag. So...
> put "<B>hello, bucko</B> this is a <B>test</B>" into tRawText
> matchChunk(tRawText, "(?s)<B>(.+)</B>", tPos, tEnd)
> returns "hello, bucko</B> this is a <B>test"
> What I'd like to return is just "hello, bucko". Is there some regex
> incantation that will do this for me? I can't seem to come up with it.