Deleting Data Woefully Slow

Kay C Lan lan.kc.macmail at gmail.com
Wed Mar 24 20:28:00 EDT 2010


In the current project I'm working on I've discovered a bottleneck. Whilst
investigating ways to speed things up I've discovered the following, which
our resident Benchmark Meister is probably fully aware of, but I thought I'd
share it with everyone else.

In my project I have a nested

repeat for each line
    repeat for each line

where in both cases there are 1.4 million lines.

I thought I was going to speed things up by deleting unnecessary lines;
instead I saw a massive slowdown.

In summary (test figures below):
put chunk after scales well
put chunk before doesn't scale well
delete line -1 is woefully slow and doesn't scale
delete line 1 doesn't scale well

Interestingly, 'delete line 1' is the opposite of 'put chunk before' and
their performance is roughly equivalent. The same cannot be said for
'delete line -1', which is the opposite of 'put chunk after'. And in the
case of 'delete line -1', my figures show it running quicker in a
'repeat with x =' statement than in the usually speedy 'repeat for each'.

What I've discovered is that it can be significantly faster to recreate 90%
of your data than to try to delete 10% of it; I'd even go as far as to
suggest you're better off recreating 99% of your data than trying to delete 1%!
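
As a rough sketch of the 'recreate rather than delete' approach, something
along these lines builds a fresh copy with 'put after' instead of deleting
from the original. The function name keepLinesWithout and its parameters are
made up for illustration; they are not built in.

function keepLinesWithout pData, pNeedle
   -- hypothetical helper: returns the lines of pData that do NOT contain pNeedle
   local tKept
   repeat for each line tLine in pData
      if tLine contains pNeedle then next repeat
      put tLine & cr after tKept
   end repeat
   delete char -1 of tKept -- drop the trailing return
   return tKept
end keepLinesWithout

It could be called as, say, put keepLinesWithout(tData, "1") into tData.
Each kept line costs one append, which is the operation that scaled well in
the figures below.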

Below are the test results I got on an MBP 2.16GHz, 2 GB RAM, Mac OS X 10.6.2,
Rev Studio 4.0.0 Build 950

5000 repeats
**repeat for each**
put after = 2 millisec
delete line -1 = 287 millisec
put before = 11 millisec
delete line 1 = 9 millisec
**repeat with x = **
put after = 137 millisec
delete line -1 = 197 millisec
put before = 227 millisec
delete line 1 = 49 millisec
Create 90% - repeat for each = 3 millisec
Create 90% - repeat with x = 309 millisec
Delete 10% = 383 millisec

50000 repeats
**repeat for each**
put after = 19 millisec
delete line -1 = 35497 millisec
put before = 935 millisec
delete line 1 = 925 millisec
**repeat with x = **
put after = 17403 millisec
delete line -1 = 17632 millisec
put before = 23939 millisec
delete line 1 = 4580 millisec
Create 90% - repeat for each = 25 millisec
Create 90% - repeat with x = 32891 millisec
Delete 10% = 33751 millisec
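
One way to put a number on 'scales well' is to estimate the exponent k in
time ~ n^k from the two runs. A minimal sketch, assuming two data points are
enough for a rough estimate (growthExponent is a hypothetical helper, not a
built-in; log10 is the standard base-10 log function):

function growthExponent pTimeSmall, pTimeLarge, pSizeSmall, pSizeLarge
   -- compare the ratio of the timings with the ratio of the data sizes, on a log scale
   return log10(pTimeLarge / pTimeSmall) / log10(pSizeLarge / pSizeSmall)
end growthExponent

For the 'repeat for each' figures, growthExponent(2, 19, 5000, 50000) comes
out just under 1 (roughly linear), whereas
growthExponent(287, 35497, 5000, 50000) comes out a little over 2 (roughly
quadratic): ten times the data took about 124 times as long.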

The script I used:

on mouseUp
   repeat 5000 times --change as necessary
      put random (9) & cr after tData
   end repeat
   --test 1
   put the millisec into tStart
   repeat for each line tLine in tData
      put tLine & cr after tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal1
   --test 2
   put the millisec into tStart
   repeat for each line tLine in tData
      delete line -1 of tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal2
   --test 3
   put the millisec into tStart
   repeat for each line tLine in tData
      put tLine & cr before tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal3
   --test 4
   put the millisec into tStart
   repeat for each line tLine in tData
      delete line 1 of tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal4
   --test 5
   put the millisec into tStart
   repeat with tCounter = 1 to the number of lines of tData
      put line tCounter of tData & cr after tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal5
   --test 6
   put the millisec into tStart
   repeat with tCounter = the number of lines of tData down to 1
      delete line -1 of tData1 --must delete (not append) to match the "delete line -1" label
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal6
   --test 7
   put the millisec into tStart
   repeat with tCounter = 1 to the number of lines of tData
      put line tCounter of tData & cr before tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal7
   --test 8
   put the millisec into tStart
   repeat with tCounter = 1 to the number of lines of tData
      delete line 1 of tData1
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal8
   --test 9
   put the millisec into tStart
   repeat for each line tLine in tData
      if NOT(tLine contains 1) then
         put tLine & cr after tData1
      end if
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal9
   --test 10
   put the millisec into tStart
   repeat with tCounter = 1 to the number of lines of tData
      if NOT(line tCounter of tData contains 1) then
         put line tCounter of tData & cr after tData1
      end if
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal10
   --test 11
   put tData into tData1
   put the millisec into tStart
   repeat with tCounter = the number of lines of tData down to 1
      if line tCounter of tData contains 1 then --roughly 1 line in 9, the part to delete
         delete line tCounter of tData1
      end if
   end repeat
   put the millisec into tEnd
   put tEnd - tStart into tTotal11

   put "**repeat for each**" & cr into msg
   put "put after = " & tTotal1 & " ms" & cr after msg
   put "delete line -1 = " & tTotal2 & " ms" & cr after msg
   put "put before = " & tTotal3 & " ms" & cr after msg
   put "delete line 1 = " & tTotal4 & " ms" & cr after msg
   put "**repeat with x = **" & cr after msg
   put "put after = " & tTotal5 & " ms" & cr after msg
   put "delete line -1 = " & tTotal6 & " ms" & cr after msg
   put "put before = " & tTotal7 & " ms" & cr after msg
   put "delete line 1 = " & tTotal8 & " ms" & cr after msg
   put "Create 90% - repeat for each = " & tTotal9 & " ms" & cr after msg
   put "Create 90% - repeat with x = " & tTotal10 & " ms" & cr after msg
   put "Delete 10% = " & tTotal11 & " ms" & cr after msg
end mouseUp


