Selecting text using REGEX
Mark Brownell
gizmotron at earthlink.net
Sun Sep 28 11:06:01 EDT 2003
On Friday, September 26, 2003, at 09:16 PM, Bojsza wrote:
> The text is always between "value=question?>" and ends with "<" #the
> quotes are not part of the text
>
> example
>
> value=question?> TEXT I WISH TO PARSE OUT<
>
> Any suggestions would be helpful (I have several hundred lines to
> search through).
Hi Bojsza,
This looks like a case for a pull-parser...
Your fragment looks like part of a fuller tag set in which the attribute
"value" always appears at the end of the start tag.
Example: <grabTag value=question?> TEXT I WISH TO PARSE OUT</grabTag>
However, your "value=question?>" fragment could belong to several
different tag sets.
Example: <grabTag value=question?> TEXT I WISH TO PARSE OUT</grabTag>
Example: <findTag value=question?> TEXT I WISH TO PARSE OUT</findTag>
Example: <dumpTag value=question?> TEXT I WISH TO PARSE OUT</dumpTag>
This approach relies on your data being well formed: no other tag may
start or end before the parser reaches the "<" that closes your text.
In the following example the extraction would stop at "</b>" and
return only " TEXT I WISH TO ".
Example: value=question?> TEXT I WISH TO </b> PARSE OUT<
I would use a pair of offset() calls inside a loop to build an array of
results. The array ends up keyed numerically, with the first instance
of the fragmented tag set stored under key 1.
If it turns out that you are using different full tag set names and
need to tell them apart, you should combine the numerical value and the
tag name when keying the array.
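One way to combine the tag name and the running number in the key might look like the sketch below. It is only an illustration: the function name pullTaggedText, the key format "tagName,number" and the fallback name "unknown" are assumptions made here, and it assumes every "value=question?>" fragment sits inside a start tag such as <grabTag ...>, so that the nearest "<" to the left of the fragment opens that tag.

```livecode
function pullTaggedText pText
   local tResult, tMarker, tLen, tSkip, tFound, tMarkerPos
   local tOpenPos, tTagName, tCount, i
   put "value=question?>" into tMarker
   put length(tMarker) into tLen
   put 0 into tSkip
   put 0 into tCount
   repeat forever
      -- offset() returns a position relative to the chars skipped
      put offset(tMarker, pText, tSkip) into tFound
      if tFound < 1 then exit repeat
      put tFound + tSkip into tMarkerPos
      -- walk backwards to the "<" that opens this start tag
      put 0 into tOpenPos
      repeat with i = tMarkerPos - 1 down to 1
         if char i of pText is "<" then
            put i into tOpenPos
            exit repeat
         end if
      end repeat
      if tOpenPos > 0 then
         put word 1 of char (tOpenPos + 1) to (tMarkerPos - 1) of pText into tTagName
      else
         put "unknown" into tTagName -- hypothetical fallback name
      end if
      -- skip past the marker, then look for the closing "<"
      put tMarkerPos + tLen - 1 into tSkip
      put offset("<", pText, tSkip) into tFound
      if tFound < 1 then exit repeat
      add 1 to tCount
      put char (tSkip + 1) to (tSkip + tFound - 1) of pText into tResult[tTagName & "," & tCount]
      put tSkip + tFound into tSkip
   end repeat
   return tResult
end function
```

With the three example lines above as input, the keys would be "grabTag,1", "findTag,2" and "dumpTag,3". Setting the itemDelimiter to comma lets you split a key back into its name and number.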
pull-parser:
  put the text of field "targetText" into tText
  put empty into tArray
  -- tStart1: chars to skip at first, then the position of the current match
  put 0 into tStart1
  put 1 into tElementNum
  put the number of chars in "value=question?>" into dChars
  repeat forever
    -- offset() returns a position relative to the chars skipped
    put offset("value=question?>", tText, tStart1) into tNum1
    if tNum1 < 1 then exit repeat
    put (tNum1 + tStart1) into tStart1
    -- find the first "<" after the start of the match
    put offset("<", tText, tStart1) into tNum2
    if tNum2 < 1 then exit repeat
    put (tNum2 + tStart1) into tStart2
    -- the wanted text lies between the end of the marker and the "<"
    put char (tStart1 + dChars) to (tStart2 - 1) of tText into zapped
    put zapped into tArray[tElementNum]
    add 1 to tElementNum
  end repeat
You will get an array, tArray, that is either empty or is filled with
results.
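As a usage sketch, the numerically keyed array can be walked in order and the results collected one per line. The field name "results" and the variable tOutput are assumptions for illustration only. Because the captured text keeps the space that follows "value=question?>", the sketch trims surrounding whitespace with the word chunk.

```livecode
local tOutput, i
put empty into tOutput
repeat with i = 1 to the number of lines in (the keys of tArray)
   -- word 1 to -1 drops leading and trailing whitespace
   put word 1 to -1 of tArray[i] & return after tOutput
end repeat
put tOutput into field "results" -- hypothetical output field
```

If tArray is empty the loop body never runs and the field is simply emptied.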
There is probably a regEx way but I have found that this tends to be
faster in most speed tests.
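For comparison, a regex version might be sketched with matchChunk(), which reports the start and end positions of the first capture group. This is an assumption-laden sketch rather than a tested replacement: the function name extractWithRegex is made up here, and because matchChunk() has no start-offset parameter the sketch deletes the text it has already processed, which may well be slower than the offset() loop on large input.

```livecode
function extractWithRegex pText
   local tArray, tCount, tStart, tEnd
   put 0 into tCount
   -- \? matches a literal question mark; [^<]* captures up to the next "<"
   repeat while matchChunk(pText, "value=question\?>([^<]*)<", tStart, tEnd)
      add 1 to tCount
      put char tStart to tEnd of pText into tArray[tCount]
      -- drop everything up to the end of the captured text
      delete char 1 to tEnd of pText
   end repeat
   return tArray
end function
```

Calling extractWithRegex(the text of field "targetText") should give the same numerically keyed array as the pull-parser.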
Mark