[somewhat OT] Text processing question (sort of)
Jim Ault
JimAultWins at yahoo.com
Sun May 18 16:02:05 EDT 2008
Is this slower, or about the same, with your data set?
[these are not tested, so you may need to tweak the syntax]
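repeat for each line LNN in myData
   get myData
   filter it with LNN  --LNN is treated as a wildcard pattern
   --as written, this appends one entry per input line, so repeats are not removed yet
   put line 1 of it & cr after uniqueOnly
end repeat
get the number of lines in uniqueOnly
put the number of lines in myData & " minus dups =" & it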
Of course, making the target data set smaller and smaller has advantages,
so adding an IF condition might defeat the speed gain near the end of the 40000 lines...
put empty into uniqueOnly
put myData into remainingLines
put the number of lines in remainingLines into remainingCount
repeat for each line LNN in myData
   filter remainingLines without LNN
   get the number of lines in remainingLines
   if it < remainingCount then --LNN was still in remainingLines, so this is its first occurrence
      put LNN & cr after uniqueOnly
      put the number of lines in remainingLines into remainingCount
   end if
end repeat
get the number of lines in uniqueOnly
put the number of lines in myData & " minus dups =" & it
If all lines are shorter than 255 chars..
put myData into arrayFood
put empty into tempVar
repeat for each line LNN in arrayFood
   put LNN & tab & 1 & cr after tempVar
end repeat
--assuming
split tempVar using cr and tab
put the keys of tempVar into uniqueOnly
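A minimal sketch of the same array idea without building the tab-delimited text first. Also untested against the 40000 lines; `seen` is a made-up array name, and it assumes every line is usable as an array key:

```livecode
-- seen is a hypothetical array name used only for this sketch
put empty into seen
repeat for each line LNN in myData
   -- assigning the same key again just overwrites it, so repeats collapse
   put 1 into seen[LNN]
end repeat
put the keys of seen into uniqueOnly
put the number of lines in myData & " minus dups =" & the number of lines in uniqueOnly
```

The keys do not come back in the original line order, so this only suits cases where the order does not matter.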
Try these and see, not that it will be worth all the time and effort. Once
you have a speedy solution, go on to the next task and leave the diving
to the benchmarkers out there.
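If you do want a rough comparison, a simple timing wrapper is probably enough. This is only a sketch: `startTime` and `elapsedMs` are made-up names, and the middle line is a placeholder for whichever approach you are timing:

```livecode
-- startTime and elapsedMs are hypothetical names for this sketch
put the milliseconds into startTime
-- ... paste one of the approaches here ...
put the milliseconds - startTime into elapsedMs
put "elapsed:" && elapsedMs && "ms"
```

Run each approach on the same copy of myData, since the timings will depend on the data and the machine.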
Jim Ault
Las Vegas
On 5/18/08 11:27 AM, "jbv" <jbv.silences at club-internet.fr> wrote:
>
> if anyone is interested, while trying to find the fastest way to compare
> each line of a list with every other line, I found the following technique
> quite fast :
>
> -- myData contains the 40000 lines to check
> -- myData1 is a duplicate of myData
>
> put myData into myData1
>
> repeat for each line j in myData
> delete line 1 of myData1
> repeat for each line i in myData1
> end repeat
> end repeat
>