Get number of occurrences of one string in another

Kay C Lan lan.kc.macmail at gmail.com
Thu Feb 23 21:09:43 EST 2012


On Fri, Feb 24, 2012 at 3:48 AM, Sumner, Walt <WSUMNER at dom.wustl.edu> wrote:

> Here's a small speed increment, still no repeat loops except for 3 or 4
> hidden within Livecode. Counts "pppppp" as 3 "pp", not 5.
>
> I've had a look at this and generally don't agree that there are only 3
"pp" in "pppppp". I accept that in some situations it might be the answer
you want, but consider a game of Tetris with a 4 block cube (2x2 squares)
and a space 6 squares wide. How many options are there to place it? 5.

So I looked at finding the 'other' answer.

In doing so I discovered that Ken's solution doesn't need the -1, in my
script it gave the right answer with "the number of elements". It was about
5 times slower than Sumner's solution though.

The answer I've currently come up with is very slow, about 14 times slower
than Sumner's, so it would appear that the best option is Sumner's solution
(based on Bob's) for single or multiple non-repeating chars.

Replace "pp" with "p" to check Ken's solution, and "ppppp" for an accurate
speed comparison where my solution gives the same number as Sumner's.

There must be a way to speed up finding the 'other' answer.

on mouseUp
   put empty into msg
    put "b bpbpb bppbppb pp " & \
         "bpppbpppb bppbpppbppppb " & \
         "bpppppbppppbpppbppbp" into tStore
   put "ppppp" into tFind
   put the number of chars in tFind into tFindNum
   put 10000 into tCycles

   put the millisecs into tStartTime
   repeat tCycles times
      put fSumner(tStore,tFind) into tCount1
   end repeat
   put the millisecs into tEndTime

   put "For " & tCycles & " cycles." & cr after msg

   put "Sumner's solution = " & tCount1 & " hits in " & \
         (tEndTime - tStartTime) & "ms" & cr after msg

   --Ken's split only works with single chars
   if (tFindNum = 1) then
      put the millisecs into tStartTime
      repeat tCycles times
         put fKen(tStore,tFind) into tCount2
      end repeat
      put the millisecs into tEndTime
      put "Ken's solution = " & tCount2 & " hits in " & \
            (tEndTime - tStartTime) & "ms" & cr after msg
      if (tCount1 <> tCount2) then
         put "Ken's solution doesn't = Sumner's solution." & cr after msg
      end if
   else --more than one char
      put 1 into tCheck
      repeat with x = 2 to the number of chars of tFind
         if (char x of tFind <> char 1 of tFind) then
            put 0 into tCheck
         end if
      end repeat
      --K's only to be used if it is repeating chars
      if (tCheck = 1) then
         put the millisec into tStartTime
         repeat tCycles time
            put fKay(tStore,tFind,tFindNum) into tCount2
         end repeat
         put the millisec into tEndTime
         put "K's solution = " & tCount2 & \
               " hits in " & (tEndTime - tStartTime) & "ms" & cr after msg
         if (tCount1 <> tCount2) then
            put "K's solution doesn't = Sumner's solution." & cr after msg
         end if
      end if
   end if
end mouseUp

function fSumner pString,pFind
   put length(pString) into tLengthBefore
   replace pFind with empty in pString
   return (tLengthBefore-length(pString)) div length(pFind)
end fSumner

function fKen pString,pFind
         split pString by pFind
         return the number of elements of pString
end fKen

function fKay pString,pFind,pFindNum
   repeat for each char tChar in pString
      put tChar after tCheck
      switch
         case (tCheck = pFind)
            put 1 after tCount
            put char 2 to -1 of tCheck into tCheck
            break
         case (the number of chars of tCheck < pFindNum)
            --don't do anything
            break
         default
            put char 2 to -1 of tCheck into tCheck
      end switch
   end repeat
   return the number of chars of tCount
end fKay



More information about the use-livecode mailing list