searching for chars within a string

Peter M. Brigham pmbrig at gmail.com
Fri Sep 26 13:53:29 EDT 2014


I'm curious, Larry -- how fast is this on your machine compared to the regex solutions?

function isInString testStr, targetStr
   -- true if every char of testStr occurs in targetStr
   -- at least as many times as it occurs in testStr
   repeat for each char c in testStr
      add 1 to countArray[c]
   end repeat
   put the keys of countArray into letterList
   repeat for each line L in letterList
      put countArray[L] into nbrCharsTest
      -- count this letter (L, not the leftover loop variable c) in targetStr
      put howMany(L,targetStr,comma) into nbrCharsNeeded
      if nbrCharsNeeded < nbrCharsTest then return false
   end repeat
   return true
end isInString

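function howmany tg,container,divChar
   -- how many times tg = <target string> occurs in container
   -- note: container must not already contain divChar
   
   if container is empty then return 0
   replace tg with divChar in container
   set the itemdelimiter to divChar
   put the number of items of container into h
   -- a trailing delimiter is ignored when counting items,
   -- so h is already the right count in that case
   if char -1 of container = divChar then return h
   return h-1
end howmany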
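
A rough way to time it, assuming both functions above sit in the same script (say, a button script). The mouseUp handler and the t-prefixed variable names are illustrative only, not from the thread; the strings are Larry's examples and the repeat count is the one Kay used. Going by Larry's description, the first result should be false and the second true.

```livecode
on mouseUp
   -- Larry's example strings from the original question
   put "AELPP" into tNeeded
   put "ABCDEKLP" into tShort
   put "ABCDEKLLLMMOOPP" into tLong
   
   put 100000 into tReps  -- same repeat count as Kay's test
   put the milliseconds into tStart
   repeat tReps times
      put isInString(tNeeded, tShort) into tShortResult
      put isInString(tNeeded, tLong) into tLongResult
   end repeat
   put the milliseconds into tEnd
   
   -- report the results and the elapsed time in the message box
   put "ABCDEKLP: " & tShortResult & cr into tMsg
   put "ABCDEKLLLMMOOPP: " & tLongResult & cr after tMsg
   put tReps & " repeats took " & tEnd - tStart & " ms" after tMsg
   put tMsg
end mouseUp
```

Note that each pass calls isInString twice, so the figure is comparable to Kay's loop, which also tests both strings per repeat.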

-- Peter

Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig


On Sep 26, 2014, at 11:58 AM, <larry at significantplanet.org> wrote:

> Hello Kay,
> Good stuff.
> I did some time tests and offset() is about twice as fast as matchText(). Don't know why.
> Larry
> 
> ----- Original Message ----- From: "Kay C Lan" <lan.kc.macmail at gmail.com>
> To: "How to use LiveCode" <use-livecode at lists.runrev.com>
> Sent: Friday, September 26, 2014 12:54 AM
> Subject: Re: searching for chars within a string
> 
> 
>> A simple way would be just to use basic matchText() for each single
>> letter and regex matchText() for repeating letters.  P.*P will find
>> double Ps, P.*P.*P will find triple Ps etc. Seems to be relatively
>> fast but if you have very large data sets other alternatives would
>> need to be investigated.
>> 
>> in the message box:
>> 
>> put "ABCDEKLP" into X
>> put "ABCDEKLLLLMMOOPP" into Y
>> put 100000 into Z
>> put 0 into a
>> put 0 into b
>> put 0 into c
>> put 0 into d
>> put the millisec into tStart
>> repeat Z times
>>   if (matchText(X, "A") AND matchText(X, "E") AND matchText(X, "L") AND matchText(X, "P.*P")) then
>>     add 1 to a
>>   else
>>     add 1 to b
>>   end if
>>   if (matchText(Y, "A") AND matchText(Y, "E") AND matchText(Y, "L") AND matchText(Y, "P.*P")) then
>>     add 1 to c
>>   else
>>     add 1 to d
>>   end if
>> end repeat
>> put the millisec into tEnd
>> put "X Passed " & a & " times." & cr into msg
>> put "X Failed " & b & " times." & cr after msg
>> put "Y Passed " & c & " times." & cr after msg
>> put "Y Failed " & d & " times." & cr after msg
>> put Z & " repeats took " & tEnd - tStart & " ms" after msg
>> 
>> The above should take less than 1 sec but for a million repeats I got:
>> 
>> X Passed 0 times.
>> X Failed 1000000 times.
>> Y Passed 1000000 times.
>> Y Failed 0 times.
>> 1000000 repeats took 1997 ms
>> 
>> NOTE: the above only works if the letters you are looking for can
>> appear in ANY order. If you need a specific order then you'd have to
>> regex matchText() for all searches, ie
>> 
>> if (matchText(X, "A.*E.*L.*P") AND matchText(X,"P.*P")) then
>> 
>> Yes, the P must appear in both searches to ensure a P has both an L
>> before it and a P after it.
>> 
>> HTH
>> 
>> On Wed, Sep 24, 2014 at 8:22 AM,  <larry at significantplanet.org> wrote:
>>> Hello,
>>> 
>>> I have done a lot of research and cannot find any way to do this:
>>> 
>>> I have a string, "AELPP" and I want to see if all 5 of those letters are in "search string"
>>> 
>>> If search string is:  ABCDEKLP,   then NO it isn't because there are two P's in the string I'm searching for.
>>> 
>>> But if search string is:  ABCDEKLLLMMOOPP, then YES the string I'm searching for is found in the search string.
>>> 
>>> It is important to my program that I just find the 5 chars anywhere within the search string and they do not have to be sequential in the search string.



