Speed

Tue Sep 17 22:06:01 EDT 2002

On Wednesday, Sep 18, 2002, at 10:37 Australia/Sydney, Jim Hurley wrote:

> I have a question about optimization.

Jim

Two answers.

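Yes, your revised code runs faster because you are breaking the data 
into chunks (I assume in both instances you are dealing with a variable 
and not a field). Each "line n of ..." reference has to count lines 
from the top of the container, so fetching lines from a 1000-line block 
is far cheaper than fetching them from the whole list. Using an array 
structure may be faster if you find a sensible way to do it.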
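
One possible way to use an array, as a rough sketch only: split the data so each line becomes a numbered element, then compare each element with the one before it. The handler name, the tab delimiter and the address being in item 3 are all assumptions for illustration, not details from your post, and the data is assumed to be sorted by address already.

```livecode
function uniqueAddressesViaArray pData
   -- Assumed layout (hypothetical): tab-delimited, address in item 3,
   -- lines already sorted by address.
   set the itemDelimiter to tab
   put the number of lines of pData into tCount
   -- After split, pData[1] .. pData[tCount] each hold one line,
   -- so a lookup no longer has to count lines from the top.
   split pData by return
   put empty into tResult
   repeat with i = 1 to tCount
      -- Keep a line only when its address differs from the previous one.
      if (i = 1) or (item 3 of pData[i] is not item 3 of pData[i - 1]) then
         put pData[i] & return after tResult
      end if
   end repeat
   delete last char of tResult -- drop the trailing return
   return tResult
end uniqueAddressesViaArray
```
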

However, "repeat for each" will work for you, so long as you "remember" 
the previous line, replacing it with the current one only when the 
address no longer matches. Practically any additional processing you do 
inside the "repeat for each" loop will be negligible in its time cost 
compared with indexing through the lines with a counter variable, so 
you can even handle multiple lines this way, or extract a block 
comprising one multi-line address each time.
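
A minimal sketch of the "remember the last line" approach, assuming the data is already sorted by address, is tab-delimited, and has the address in item 3. The delimiter, the item number and the handler name are guesses for illustration, not something taken from your post.

```livecode
function firstVoterPerAddress pData
   -- Assumed layout (hypothetical): tab-delimited, address in item 3,
   -- lines already sorted by address.
   set the itemDelimiter to tab
   put empty into tLastAddress
   put empty into tResult
   put true into tFirst
   repeat for each line tLine in pData
      put item 3 of tLine into tAddress
      -- Keep the line only when the address changes from the one remembered.
      if tFirst or (tAddress is not tLastAddress) then
         put tLine & return after tResult
         put tAddress into tLastAddress
         put false into tFirst
      end if
   end repeat
   delete last char of tResult -- drop the trailing return
   return tResult
end firstVoterPerAddress
```

Called on a variable rather than a field, for example "put firstVoterPerAddress(tData) into tMailingList", this makes a single pass over the data and never asks for "line n of" anything.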

regards
David
>
> I am helping a local candidate with their database. It is a large 
> county election database which I have imported into a field within Rev.
>
> This is a voter database in which we would like to identify a single 
> address for all voters within a given household so that we do not 
> have to send multiple letters to individual voters within the same 
> household. This makes a big difference in mailing costs.
>
> I found that my original program runs prohibitively slowly.
>
> But I find when I break up the data into smaller blocks, things run 
> much more rapidly. For example I use the following code:
>
>   repeat with k = 0 to 8
>     put line k*1000 to (k+1)*1000+1 of tField into temp
>     put identifyUniqueAddresses(temp) into a[k]
>   end repeat
>
> so that the data in the variable tField is broken up into 9 chunks of 
> 1000 lines each. Later I reassemble the results from the array, a[k].
>
> If instead I try to run the whole field at once using:
>
>        identifyUniqueAddresses(tField)
>
> I would have to wait all day for the data in tField to process.
>
> (I have not found a way to use:
>
>        repeat for each line tLine in tField
>
> I have to be able to discover whether *successive* lines in the sorted 
> data share the same address.)
>
> Now I'm sure my handler, identifyUniqueAddresses, is not the most 
> efficient code, but my question is this:  Why does the handler run so 
> much more rapidly working on several smaller chunks which are later 
> reassembled rather than all at once?
>
> I suspect the problem may be successively pulling up lines of text 
> from a very long list of lines. Would it help if I first put the lines 
> into an array and then worked with the array?
>
> Is there an optimizer out there? Gentlemen and gentle ladies, start 
> your engines.
>
>  
>
> --
> Jim Hurley