Repeat for each

Alex Tweedly alex at tweedly.net
Wed Nov 9 13:03:47 EST 2005


Marty Knapp wrote:

>
> I've been tinkering with this a bit and wanted to ask a few more 
> questions. Again, my data set is 8 items with all but one being a 
> number. I need to be able to select a subset by analysing 1 or more of 
> these items. My current data set is approx 128,000 records. When I 
> filter the data on the item that contains words, it's pretty fast - 
> about 1 second (I have an old Mac G4-single processor 867mgz, and 
> running Rev 2.2.1) The speed is exactly the same whether I use the 
> above method or just 'filter theData with "*word*"'
>
That seems terribly slow to me :-)

I have a file with 120,000 records which look much like

> 2,908,597,451700,398,340,zxcv,3.5

and filtering that with "*zxcv*" takes about 80 - 85 msecs (i.e. less 
than a tenth of a second) on my 2-year old laptop.

(Note - only 1 in 1000 of the records match, so 120 total matches)
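
To reproduce timings like these, a test file of the same general shape can be generated. This is only a rough sketch: the handler name makeTestData, the filler word "asdf" and the value ranges are all invented for illustration, and it is not how the actual file was built. It puts "zxcv" in item 7 of every 1000th line.

    function makeTestData pCount
        local tData, tWord
        repeat with i = 1 to pCount
            -- every 1000th record gets the word we will search for
            if i mod 1000 = 0 then
                put "zxcv" into tWord
            else
                put "asdf" into tWord
            end if
            -- 8 comma-separated items; item 7 is the word, item 8 is fractional
            put random(9) & comma & random(999) & comma & random(999) & comma & \
                  random(999999) & comma & random(999) & comma & random(999) & comma & \
                  tWord & comma & random(99) / 10 & cr after tData
        end repeat
        return tData
    end makeTestData

Usage would be something like: put makeTestData(120000) into URL "file:asdf.txt"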

Doing the same thing in a 'repeat for each' loop takes slightly *less* time 
(average 77 ms). Here's the code:

on mouseUp
    -- Method 1: the built-in filter command
    put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
    put the millisecs into tStart
    filter t with "*" & "zxcv" & "*"
    put t into fld "Field"
    put the number of lines in fld "Field" && the millisecs - tStart & cr after msg

    -- Method 2: repeat for each, collecting matching lines in t1
    put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
    put the millisecs into tStart
    repeat for each line L in t
        if "zxcv" is in L then
            put L & cr after t1
        end if
    end repeat
    -- note: the "got" header line adds 1 to the line count reported below
    put "got" & cr & t1 into fld "Field"
    put the number of lines in fld "Field" && the millisecs - tStart & cr after msg
end mouseUp


Changing that to test only the item that holds "zxcv" (item 7) makes it 
even faster: 66 msec. The code was:

    put the millisecs into tStart
    repeat for each line L in t
        if "zxcv" = item 7 of L then
            put L & cr after t1
        end if
    end repeat

I then tried it with a number (using a variable instead of a 
constant), and it slowed down slightly to 92 msec:

on tryit pNum
    put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
    put the millisecs into tStart
    -- keep only the lines whose 4th item equals the number passed in
    repeat for each line L in t
        if item 4 of L = pNum then
            put L & cr after t1
        end if
    end repeat
    put "got" & cr & t1 into fld "Field"
    put the number of lines in fld "Field" && the millisecs - tStart & cr after msg
end tryit

So then I tried varying the item number and the value.

Changing the test to a more common match (>= 451700, giving 11881 matches) 
still only took 278 msec.
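
The code for the range test isn't shown above, so here is a minimal sketch of one way to do it in a single 'repeat for each' pass. It is not the code that was timed. It assumes the default comma itemDelimiter, and the handler name filterRange and its parameters (pData, pItem, pMin, pMax) are made up for illustration.

    function filterRange pData, pItem, pMin, pMax
        local tResult, tValue
        repeat for each line L in pData
            -- pull out the item once, then test both ends of the range
            put item pItem of L into tValue
            if tValue >= pMin and tValue <= pMax then
                put L & cr after tResult
            end if
        end repeat
        delete last char of tResult -- drop the trailing cr
        return tResult
    end filterRange

It would be called as, for example: put filterRange(t, 4, 451700, 500000) into fld "Field". Testing a second item as well should just be another 'and' clause inside the same loop, so the data is still only walked once.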

> When I use the 'repeat for each' to evaluate for 'word' it takes 1.5 
> minutes. Where it really gets slow is evaluating the numbers. Most 
> often what I need is a range of numbers, so would use greater than, 
> less than, or both. Typically I would evaluate 2 of the numbers, but 
> need to be able to evaluate all 8 items if needed. The numbers range 
> from 0 to 8 digits, some whole numbers some fractional. I just did a 
> test evaluating one number and it took 3.5 minutes. When I evaluated 2 
> numbers it took 5.8 minutes.

There's something odd going on.

>
> Do these numbers sound right? Or am I being a bozo somehow! I've been 
> thinking that I should consider using either Valentina or altSQLite. 
> Any input there? Is one more suited, easier or ???
>
Those numbers don't sound right to me. Can you send a sample of the data 
(just 2 or 3 records) so I can make sure I've not misinterpreted what the 
data looks like? And then maybe send the code snippet that is taking so long.

Thanks

-- 
Alex Tweedly       http://www.tweedly.net


