Finding matched parentheses
Peter M. Brigham
pmbrig at gmail.com
Tue Jul 30 09:52:27 EDT 2013
Here is a function generalized to work for any pair of delimiters, building on dfepstein's function. I *think* it works, but I have done only limited testing, so please try to find errors. If it works, I'm curious to see benchmarks on long strings.
-- Peter
Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig
------------------
function offsetPair str,tIndex,a,b -- str cannot be more than one line
   -- returns "tIndex,n" where n is the offset of the b matching the a at char tIndex of str
   -- if (char tIndex of str) <> a, or that a has no matching b, then returns empty -- error
   -- if a or b are not in str then returns 0
   -- if tIndex = empty then assumes the first occurrence of a and returns only n
   -- if a or b = empty then assumes parentheses search
   if a = empty then put "(" into a
   if b = empty then put ")" into b
   put offsets(a,str) into openParens
   put offsets(b,str) into closeParens
   if openParens = 0 then return 0
   if closeParens = 0 then return 0
   if tIndex = empty then return firstOffsetPair(a,b,str)
   if tIndex is not among the items of openParens then return empty
   -- char tIndex of str <> a
   -- the a at tIndex is the first a in the rest of str, so its match is the
   -- first balanced b that firstOffsetPair finds there; trimming the tail by
   -- an index taken from the full string could cut off the match when
   -- complete pairs precede tIndex, e.g. "()()(bbbbbbbb)" with tIndex = 5
   put tIndex - 1 into lengthBefore
   put char tIndex to -1 of str into workingStr
   put firstOffsetPair(a,b,workingStr) into theCloseIndex
   if theCloseIndex is empty then return empty -- no matching b
   return tIndex,(theCloseIndex + lengthBefore)
end offsetPair
function firstOffsetPair a,b,str
   -- from dfepstein
   -- str cannot be more than one line
   -- returns the offset of the instance of char b "matching" the first instance of char a in str, or 0 if no a or empty if no match
   put offset(a,str) into ca
   if ca = 0 then return 0
   put numToChar(7) into char 1 to ca of str
   set lineDelimiter to a
   set itemDelimiter to b
   repeat with i = 1 to the number of items in str
      -- a trailing delimiter does not start a new line, so an a directly
      -- before the i-th b (as in "(())") would go uncounted; the appended
      -- sentinel char makes it count
      if i = the number of lines in (item 1 to i of str & numToChar(7)) then return ca+length(item 1 to i of str)
   end repeat
   return empty
end firstOffsetPair
function offsets str,container,includeOverlaps
   -- returns a comma-delimited list of all the offsets of str in container
   -- returns 0 if not found
   -- third param is optional:
   -- offsets("xx","xxxxxx") returns "1,3,5" not "1,2,3,4,5"
   -- ie, by default, overlapping offsets are not counted
   -- if you want overlapping offsets then pass "true" in 3rd param
   if str is not in container then return 0
   if includeOverlaps = empty then put false into includeOverlaps
   put empty into offsetList
   put 0 into startPoint
   repeat
      put offset(str,container,startPoint) into thisOffset
      if thisOffset = 0 then exit repeat
      add thisOffset to startPoint
      put startPoint & comma after offsetList
      if not includeOverlaps then
         add length(str)-1 to startPoint
      end if
   end repeat
   return item 1 to -1 of offsetList -- delete trailing comma
end offsets
function howmany tg,container
   -- how many tg = <target string> is in container
   -- note that howmany("00","000000") returns 3, not 5
   -- if you want to allow overlapping matches, use:
   -- number of items of offsets(tg,container,"true")
   -- (see offsets() function)
   -- requires getDelimiters()
   put getDelimiters(container) into divChar
   replace tg with divChar in container
   set the itemdelimiter to divChar
   put the number of items of container into h
   if char -1 of container = divChar then return h
   -- trailing delimiter is ignored
   return h-1
end howmany
function getDelimiters tText,nbr
   -- returns a cr-delimited list of <nbr> characters
   -- not found in the variable tText
   -- use for delimiters for, eg, parsing CSV files
   -- usage: put getDelimiters(CSVtext,2) into tDelims
   -- put line 1 of tDelims into lineDivider
   -- put line 2 of tDelims into itemDivider
   -- etc.
   if nbr = empty then put 1 into nbr -- default 1 delimiter
   put empty into delimList -- so the line count below is 0 until a delimiter is found
   put "2,3,4,5,6,7,8" into baseList
   -- could use other non-printing ASCII values
   put the number of items of baseList into maxNbr
   if nbr > maxNbr then return "Error: max" && maxNbr && "delimiters."
   repeat with tCount = 1 to nbr
      put true into failed
      repeat with i = 1 to the number of items of baseList
         put item i of baseList into testNbr
         put numtochar(testNbr) into testChar
         if testChar is not in tText then
            -- found one, store and get next delim
            put false into failed
            put testChar into line tCount of delimList
            exit repeat
         end if
      end repeat
      if failed then
         put the number of lines of delimList into nbrFound
         if nbr = 1 then
            return "Cannot get delimiter!"
         else if nbrFound = 0 then
            return "Cannot get any delimiters!"
         else
            return "Can only get" && nbrFound && "delimiters!"
         end if
      end if
      delete item i of baseList
   end repeat
   return delimList
end getDelimiters
----------------------
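For anyone hunting for errors, a plain depth counter makes a handy reference to compare against. The sketch below is not part of dfepstein's function or the script above: the handler names matchByDepth and compareMatchers are made up for illustration, a and b are assumed to be single characters, and tIndex must be supplied. It walks the string once from tIndex, adding 1 for each a and subtracting 1 for each b, and stops when the depth comes back to 0.

```livecode
-- hypothetical reference implementation, for comparison only
function matchByDepth str,tIndex,a,b
   if a = empty then put "(" into a
   if b = empty then put ")" into b
   if char tIndex of str <> a then return empty
   put 0 into tDepth
   repeat with i = tIndex to the number of chars in str
      put char i of str into tChar
      if tChar = a then
         add 1 to tDepth
      else if tChar = b then
         subtract 1 from tDepth
         if tDepth = 0 then return tIndex,i
      end if
   end repeat
   return empty -- the a at tIndex is never closed
end matchByDepth

-- hypothetical check: runs both functions at every "(" in str and
-- reports how many answers differ, plus the total time taken
on compareMatchers str
   if "(" is not in str then exit compareMatchers
   put 0 into tMismatches
   put the milliseconds into tStart
   repeat for each item tPos in offsets("(",str)
      if offsetPair(str,tPos,"(",")") <> matchByDepth(str,tPos,"(",")") then
         add 1 to tMismatches
      end if
   end repeat
   put tMismatches && "mismatches in" && (the milliseconds - tStart) && "ms"
end compareMatchers
```

For example, matchByDepth("aa(ss)(xx)(yy)",7) returns "7,10". Calling compareMatchers with a long balanced string puts the mismatch count and elapsed time in the message box, which also gives a rough benchmark. Unbalanced input is better tested separately, since the two functions need not agree on what to return for an opener that is never closed.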
On Jul 29, 2013, at 4:15 PM, DunbarX at aol.com wrote:
>
> Hmmm.
>
> I read the original post as finding the "closing parenthesis ... of a pair"
>
> "Any pair?"
>
> This seems to indicate that all nested parens have to be parsed as a whole. If you want to find a particular
> related couple, you have to use the correct ordered pair derived from the function.
>
> It is much simpler to find the first ")" and work backward. But that would preclude nested parens, since the
> firstmost and innermost pair would be the only one found. You cannot find that kind of first "pair" and also
> include nested pairs. That first pair is always minimally small.
>
> The function could be easily modified to list all pairs within any designated pair, of course.
>
> Anyway, it was fun to play with.
>
> Craig
>
> -----Original Message-----
> From: Peter Haworth <pete at lcsql.com>
> To: How to use LiveCode <use-livecode at lists.runrev.com>
> Sent: Mon, Jul 29, 2013 1:15 pm
> Subject: Re: Finding matched parentheses
>
>
> On Mon, Jul 29, 2013 at 9:07 AM, <dunbarx at aol.com> wrote:
>
>> I tried your script on the string:
>>
>> aa(ss)(xx)(yy)
>>
>> it only returned the parens bracketing "ss"
>>
>
> I think that's what he wants to do: just find the position of the first
> set of parentheses, taking nested parens into account. But I'm not sure.
>
> Personally, I'd use the regex that Thierry posted a couple of days back.
> No recursion involved and one line of code does the job.
>
> Pete
> lcSQL Software <http://www.lcsql.com>
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode