Repeat for each
Alex Tweedly
alex at tweedly.net
Wed Nov 9 13:03:47 EST 2005
Marty Knapp wrote:
>
> I've been tinkering with this a bit and wanted to ask a few more
> questions. Again, my data set is 8 items with all but one being a
> number. I need to be able to select a subset by analysing 1 or more of
> these items. My current data set is approx 128,000 records. When I
> filter the data on the item that contains words, it's pretty fast -
> about 1 second (I have an old single-processor 867 MHz Mac G4,
> running Rev 2.2.1). The speed is exactly the same whether I use the
> above method or just 'filter theData with "*word*"'
>
That seems terribly slow to me :-)
I have a file with 120,000 records which look much like
> 2,908,597,451700,398,340,zxcv,3.5
and filtering that with "*zxcv*" takes about 80-85 msec (i.e. less
than a tenth of a second) on my 2-year-old laptop.
(Note: only 1 in 1000 of the records match, so 120 matches in total.)
Doing the same thing in a "repeat for each" loop takes slightly *less*
time, averaging 77 msec. Here's the code:
on mouseUp
   -- Test 1: the built-in filter command
   put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
   put the millisecs into tStart
   filter t with "*" & "zxcv" & "*"
   put t into fld "Field"
   put the number of lines in fld "Field" && the millisecs - tStart & cr after msg

   -- Test 2: the same match using repeat for each
   put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
   put empty into t1
   put the millisecs into tStart
   repeat for each line L in t
      if "zxcv" is in L then
         put L & cr after t1
      end if
   end repeat
   put "got" & cr & t1 into fld "Field"
   put the number of lines in fld "Field" && the millisecs - tStart & cr after msg
end mouseUp
Changing that to check which item holds "zxcv" makes it even faster:
66 msec. The code was:
   put the millisecs into tStart
   repeat for each line L in t
      if "zxcv" = item 7 of L then
         put L & cr after t1
      end if
   end repeat
I then tried it with a number (and using a variable instead of a
constant), and it slowed down slightly to 92 msec:
on tryit pNum
   put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
   put the millisecs into tStart
   repeat for each line L in t
      -- pNum is compared against the fourth comma-separated item
      if item 4 of L = pNum then
         put L & cr after t1
      end if
   end repeat
   put "got" & cr & t1 into fld "Field"
   put the number of lines in fld "Field" && the millisecs - tStart & cr after msg
end tryit
So then I tried passing in both the item number and the value as
variables. Changing it to a more common match (>= 451700, giving 11881
matches) still only took 278 msec.
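A minimal sketch of that generalised test, assuming the same file and
field as above. The handler name tryRange and the parameters pItemNum
and pMin are hypothetical (they are not from the script that was
timed), so treat it as an illustration only:

```livecode
on tryRange pItemNum, pMin
   -- hypothetical handler: keep lines whose item pItemNum is >= pMin
   put URL ("file:D:/Our Documents/Alex/RunRev/asdf.txt") into t
   put empty into t1
   put the millisecs into tStart
   repeat for each line L in t
      if item pItemNum of L >= pMin then
         put L & cr after t1
      end if
   end repeat
   put "got" & cr & t1 into fld "Field"
   put the number of lines in fld "Field" && the millisecs - tStart & cr after msg
end tryRange
```

Calling tryRange 4, 451700 would apply the ">= 451700" test to item 4.
When both sides look like numbers, >= compares them numerically rather
than as strings.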
> When I use the 'repeat for each' to evaluate for 'word' it takes 1.5
> minutes. Where it really gets slow is evaluating the numbers. Most
> often what I need is a range of numbers, so would use greater than,
> less than, or both. Typically I would evaluate 2 of the numbers, but
> need to be able to evaluate all 8 items if needed. The numbers range
> from 0 to 8 digits, some whole numbers some fractional. I just did a
> test evaluating one number and it took 3.5 minutes. When I evaluated 2
> numbers it took 5.8 minutes.
There's something odd going on.
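For what it's worth, a range test on two items could be done in a
single "repeat for each" pass. A minimal, untimed sketch, assuming
comma-delimited records like my sample line; the function name
linesInRange and all of its parameters are hypothetical:

```livecode
function linesInRange pData, pItemA, pMinA, pMaxA, pItemB, pMinB, pMaxB
   -- hypothetical helper: return the lines of pData where item pItemA
   -- lies in pMinA..pMaxA and item pItemB lies in pMinB..pMaxB
   put empty into tResult
   repeat for each line L in pData
      put item pItemA of L into tA
      if tA < pMinA or tA > pMaxA then next repeat
      put item pItemB of L into tB
      if tB < pMinB or tB > pMaxB then next repeat
      put L & cr after tResult
   end repeat
   delete last char of tResult -- drop the trailing return
   return tResult
end linesInRange
```

It might be used as
   put linesInRange(t, 4, 400000, 500000, 6, 300, 400) into fld "Field"
where the item numbers and limits are arbitrary example values.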
>
> Do these numbers sound right? Or am I being a bozo somehow! I've been
> thinking that I should consider using either Valentina or altSQLite.
> Any input there? Is one more suited, easier or ???
>
Those numbers don't sound right to me. Can you send a sample of the data
(just 2 or 3 records) to make sure I've not misinterpreted what the data
looks like?
And then maybe send the code snippet that is taking so long.
Thanks
--
Alex Tweedly http://www.tweedly.net