finding repeating patterns

Rob Cozens rcozens at pon.net
Sun Aug 27 13:00:24 EDT 2006


All,

>Does anyone have (or know of) an algorithm to find repeating
>patterns of characters in a string ?

This started out as a mental exercise for moi, and the mouseUp logic 
is untested:

on mouseUp
   put field 1 into sourceString
   put empty into patternList
   put 6 into minPatternLength -- or whatever
   put length(sourceString) div 2 into maxPatternLength  -- pattern 
can't be longer than half the length
   if maxPatternLength < minPatternLength then exit mouseUp
   repeat with patternLength = maxPatternLength down to minPatternLength
       put ((length(sourceString) mod patternLength)-1)*patternLength 
into maxStartPosition
       -- pattern can't start within patterenLength-1 chars from end
       put patternLength - 1 into endAdjustment
       repeat with startingPoint = 1 to maxStartPosition
         set cursor to busy
         put startingPoint+endAdjustment into patternEnd
         put char startingPoint to patternEnd of sourceString into targetPattern
         if targetPattern is among the words of patternList then next 
repeat -- may need stronger checking?
         get the number of lines of offsets(targetPattern,char 
(patternEnd+1) to -1 of sourceString)
         if it > 0 then put targetPattern&&it&return after patternList
       end repeat
     end repeat
     put patternList
end mouseUp

OTOH, working on it led me to the following offsets function, which 
has been tested.

function offsets targetString, sourceString
   put empty into offsetsList
   put 0 into offsetAdjustment
   put length(targetString)-1 into targetLengthAdjustment
   repeat
     get offset(targetString,sourceString)
     if it = 0 then return offsetsList
     put (it+offsetAdjustment) &return after offsetsList
     put it+targetLengthAdjustment into deleteCutoff
     delete char 1 to deleteCutoff of sourceString
     add deleteCutoff to offsetAdjustment
   end repeat
end offsets

Example: offsets("at","The cat in the hat smelled a rat where he sat.") returns

6
17
31
44

Note that the same logic can be applied to create lineOffsets, 
itemOffsets, and wordOffsets functions.

Enjoy!

-- 

Rob Cozens
CCW, Serendipity Software Company

"And I, which was two fooles, do so grow three;
Who are a little wise, the best fooles bee."

from "The Triple Foole" by John Donne (1572-1631)


More information about the use-livecode mailing list