regex question
J. Landman Gay
jacque at hyperactivesw.com
Sun May 18 13:21:38 EDT 2008
jbv wrote:
> Hi list,
>
> Does anyone have a regex to remove any text between parentheses
> (including the parentheses)?
>
> I've tried many options, but none seems to work perfectly...
>
> For instance:
> "\(\)" removes only parentheses with no text inside
>
> I've also tried this : "\([a-z 0-9 ]+\)", but no luck...

In my RevLive presentation I addressed this kind of processing and
showed the results of a number of speed tests. It turns out that a
"repeat for each" loop is almost 200 times faster than a regular
expression. My test did something very similar to what you want to do;
in my case, I was removing all HTML from a web page, but it would be just
as easy to substitute parentheses for the "<" and ">" characters I was
looking for.
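
For comparison, the regex side of that kind of test can be as small as
one replaceText call. The sketch below is an illustration, not the code
from the presentation: the handler name removeParensRegex is made up,
and the pattern assumes the parentheses are never nested.

```livecode
function removeParensRegex pData
   -- \( and \) match literal parentheses; [^)]* matches any run of
   -- characters that are not a closing parenthesis
   return replaceText(pData, "\([^)]*\)", empty)
end removeParensRegex
```

For example, removeParensRegex("foo (bar) baz") returns "foo  baz",
with both of the original spaces left in place.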
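
The key is using the offset function along with its "skip" parameter to
find the first character (the left parenthesis in your example), then
getting the offset of the second character (the right parenthesis) and
extracting the data around the pair. Note that offset returns a position
counted from the end of the skipped characters, not from the start of
the string, so each result has to be added to the running skip count.

Here is my test example, which should be easy for you to modify:
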
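function removeRepeat pData
   repeat for each line l in pData
      put 0 into tSkip
      repeat
         -- offset() returns a position relative to tSkip, or 0 if not found
         put offset("<",l,tSkip) into tStart
         if tStart = 0 then
            -- no more tags: keep whatever text follows the last one
            put char tSkip+1 to -1 of l after tNewData
            exit repeat
         end if
         -- keep the text in front of the tag
         put char tSkip+1 to tSkip+tStart-1 of l & space after tNewData
         add tStart to tSkip
         put offset(">",l,tSkip) into tEnd
         if tEnd = 0 then exit repeat -- unclosed tag: drop the rest of the line
         add tEnd to tSkip
      end repeat
      put cr after tNewData
   end repeat
   -- drop lines that ended up completely empty
   filter tNewData without empty
   return tNewData
end removeRepeat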
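
Adapted to parentheses, the same loop might look like the sketch below.
The handler name removeParens is made up for illustration. It assumes
that nothing should be inserted where a parenthetical is removed, that
text after the last ")" on a line should be kept, and that parentheses
are never nested. An unmatched "(" causes the rest of that line to be
dropped.

```livecode
function removeParens pData
   put empty into tNewData
   repeat for each line l in pData
      put 0 into tSkip
      repeat
         -- offset() returns a position relative to tSkip, or 0 if not found
         put offset("(", l, tSkip) into tStart
         if tStart = 0 then
            -- no more opening parentheses: keep the rest of the line
            put char tSkip + 1 to -1 of l after tNewData
            exit repeat
         end if
         -- keep the text in front of the "("
         put char tSkip + 1 to tSkip + tStart - 1 of l after tNewData
         add tStart to tSkip
         put offset(")", l, tSkip) into tEnd
         -- unmatched "(": drop the rest of the line
         if tEnd = 0 then exit repeat
         add tEnd to tSkip
      end repeat
      put cr after tNewData
   end repeat
   -- remove the cr that was added after the final line
   delete last char of tNewData
   return tNewData
end removeParens
```

Unlike the HTML version, this one does not filter out empty lines, so
the line count of the input is preserved.
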
--
Jacqueline Landman Gay | jacque at hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com