permuting a string (was Re: Speed)
Geoff Canyon
gcanyon at gmail.com
Mon Sep 1 20:35:31 EDT 2014
I have a set of code that seems to do the trick. It takes as an argument
the number of each element to permute. So for your examples:
On Mon, Sep 1, 2014 at 10:32 AM, Beat Cornaz <B.Cornaz at gmx.net> wrote:
> On my computer :
> Input : 111222333456 8320 mSec
> Input : 111112222233 58 mSec
>
It takes "3,3,3,1,1,1" because there are 3 ones, 3 twos, 3 threes, 1 four,
1 five, and 1 six.
And "5,5,2" because there are 5 ones, 5 twos, and 2 threes.
Like my previous routine, it takes an argument for the ASCII value to start
with, but note that duplicates take ASCII values during processing, so the
actual results from these arguments would be permutations of 111444777:;<
and 1111166666;;
My routine works the way I described:
1. Get the permutations of all the duplicates, using Alex's serialpermut (I
think I made minor modifications, but nothing significant).
2. Use replace to substitute unique values for all the duplicates.
3. Use a modified version of my original routine to create permutations,
starting from the base result from (2).
4. Use replace to replace the duplicate values that were made unique in (2).
On my computer the above two arguments process in 0.634 and 0.054 seconds.
That's much faster for the first string, and about as fast for the second
string. Because the second string is all duplicates, it's entirely
serialpermut's work, which is seemingly about as fast as your (Beat's)
permute-duplicates routine.
One other thing I noticed: my routine is much faster for small permutation
jobs, but the gap closes as the number of permutations goes up. My code is
still faster than other routines I tried up to 20 million permutations, but
by that scale the advantage is only about 2:1. So for *really* big jobs, it
would make sense to go with another algorithm.
Here's my code:
on timeX S
-- time a permuation
-- timeX "2,2" returns
-- 0.000073
put the long seconds into T
put PX(S,49) into X
put (the long seconds - T)
end timeX
on testX S
-- test a permuation
-- outputs a string starting from 1
-- with time, the actual and correct number of results, and start and
end samples
-- testX "2,2" returns
--0.000083 6 6
--1133
--1313
--1331
--3113
--3131
--1313
--1331
--3113
--3131
--3311
put the long seconds into T
put PX(S,49) into X
put fact(sum(S)) into F
repeat for each item i in S
divide F by fact(i)
end repeat
put (the long seconds - T) && the number of lines of X && F & cr & cr &
line 1 to 5 of X & cr & cr & line -5 to -1 of X
end testX
function PX N,B
-- N is the list of depths to permute
-- B is the ASCII value to start from
-- PX("1,1",49) returns 21 and 12
-- PX("1,1,1",53) returns 675 765 756 576 657 567
-- PX("2,2,1",49) returns permutations of 11223
sort items of N descending numeric
if item 1 of N = 1 then
-- no duplicates
put B + 1 into bCounter
put numToChar(B) & cr into R
else
-- permute the duplicates
put B into bCounter
repeat for each item i in N
if i = 1 then exit repeat
repeat i
put numToChar(bCounter) after P
end repeat
add i to bCounter
end repeat
put serialpermut(P) into R
-- if nothing but duplicates, return
if item -1 of N > 1 then return R
-- substitute in unique values
put U(R) into R
end if
-- permute the uniques using replace
repeat with Z = bCounter to B + sum(N) - 1
repeat with i = B to Z - 1
put R into T2
replace numToChar(i) with numToChar(Z) in T2
replace cr with numToChar(i) & cr in T2
put T2 after T
end repeat
replace cr with numToChar(Z) & cr in R
put T after R
put empty into T
end repeat
-- substitute back in the duplicates
repeat for each item i in N
if i = 1 then exit repeat
repeat with bCounter = B + 1 to B + i - 1
replace numToChar(bCounter) with numToChar(B) in R
end repeat
put bCounter + 1 into B
end repeat
return R
end PX
function fact X
-- simple factorial to check values
if X = 1 then return 1
put 2 into R
repeat with i = 3 to X
multiply R by i
end repeat
return R
end fact
function U S
-- make the characters in the lines of S unique
-- assumes that each line contains the same set of characters
-- assumes that the characters will not overlap when made unique
repeat with i = 1 to the number of characters of line 1 of S
add 1 to C[char i of S]
end repeat
repeat for each line L in S
repeat for each key K in C
repeat with i = charToNum(K) + 1 to charToNum(K) + C[K] - 1
put numToChar(i) into char offset(K,L) of L
end repeat
end repeat
put L & cr after R
end repeat
return R
end U
function serialpermut pMute
if the number of chars in pMute = 1 then return pMute & cr
put empty into tOutput
-- an entry has
-- item 1 is a prefix
-- item 2 is the remaining set of chars to permute
-- tOutput contains the result of the permutation
put TAB & pMute & CR into todo
set the itemdel to TAB
repeat
if todo is empty then return tOutput
put todo into tDoing
put empty into todo
repeat for each line L in tDoing
put item 1 of L into tPrefix
put item 2 of L into tPerm
switch the number of chars in tPerm
case 1
put tPrefix & tPerm & CR after tOutput
break
case 2
put tPrefix & tPerm & CR after tOutput
if char 1 of tPerm <> char 2 of tPerm then put tPrefix &
char 2 of tPerm & char 1 of tPerm & CR after tOutput
break
default
put empty into tDone
repeat with i = 1 to the number of chars in tPerm
put char i of tPerm into c
if c is among the chars of tDone then next repeat
put c after tDone
put char 1 to i-1 of tPerm & char i+1 to -1 of tPerm into
temp
put tPrefix & c & TAB & temp & CR after todo
end repeat -- over chars in tPerm
end switch
end repeat
end repeat
end serialpermut
More information about the use-livecode
mailing list