Working with csv files that are 5000 lines or more

Jim Schaubeck schaubeck at mac.com
Wed Apr 9 21:50:57 EDT 2008


Thank you, Richard.  Great feedback.  That makes complete sense as to why "repeat for each line" is lightning fast.
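
For anyone who wants to see the difference on their own machine, here is a minimal timing sketch of the two loop forms quoted below. The test data (5000 made-up comma-separated records) and the variable names are my own assumptions, not anything from Richard's post, and the timings will vary with hardware and engine version.

```livecode
on mouseUp
   local i, tData, tLine, tStart, tWithTime, tForEachTime

   -- Build hypothetical test data: 5000 lines of three comma-separated fields
   repeat with i = 1 to 5000
      put "record" & i & ",alpha,beta" & return after tData
   end repeat
   delete the last char of tData -- drop the trailing return

   -- Indexed form: "line i of tData" is found by counting from line 1 on every pass
   put the milliseconds into tStart
   repeat with i = 1 to the number of lines of tData
      get line i of tData
   end repeat
   put the milliseconds - tStart into tWithTime

   -- "repeat for each" form: the engine walks tData once, handing over each line in turn
   put the milliseconds into tStart
   repeat for each line tLine in tData
      get tLine
   end repeat
   put the milliseconds - tStart into tForEachTime

   answer "repeat with:" && tWithTime && "ms" & return & \
         "repeat for each:" && tForEachTime && "ms"
end mouseUp
```

Raising the 5000 to a larger count should widen the gap, since the indexed form does more counting the farther it gets into the data.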

 
On Wednesday, April 09, 2008, at 06:36PM, "Richard Gaskin" <ambassador at fourthworld.com> wrote:
>Jim Schaubeck wrote:
> > Very good feedback for me.  You are correct, the method I was
> > using was very slow (I had no idea).
>
>Superficially the two main forms of repeat look very similar, but under 
>the hood they do very different things.
>
>When you do this:
>
>repeat with i = 1 to the number of lines of tData
>    get line i of tData
>end repeat
>
>...that second line has to count the lines from 1 to i each time through 
>the loop.  That's why you saw the increasing slowdown the farther it got 
>into the data.
>
>But when you do this:
>
>repeat for each line tLine in tData
>   get tLine
>end repeat
>
>...then the engine makes the assumption that the data in tData won't be 
>changing, so it doesn't need to count as it goes.  Instead it parses as 
>it goes, automatically putting the value of each line into tLine each 
>time through the loop.
>
>For data sets with just 5000 records, or even 50,000 records, you 
>probably don't need an RDBMS to handle them.
>
>In cases where you're processing all records in sequence you probably 
>don't even need an array.  Arrays are lightning fast for random access, 
>such as when you have a list of keys and you need to retrieve their 
>values.  But the split and combine commands are very computationally 
>intensive, so for sequential processing of the full data set the 
>overhead of split and combine usually benchmarks as taking longer than 
>simply using "repeat for each" on the delimited text.
>
>My WebMerge customers regularly process data sets in the hundreds of 
>thousands of lines, and write me happy notes about how good the 
>performance is. :)
>
>I wish I could take credit for it, but really it's all Scott Raney, the 
>fella who owned the engine at the time the "repeat for each" form was 
>added. It's a godsend.
>
>-- 
>  Richard Gaskin
>  Managing Editor, revJournal
>  _______________________________________________________
>  Rev tips, tutorials and more: http://www.revJournal.com
>
>
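
Richard's point about sequential processing can be sketched roughly as follows: walk the delimited text once with "repeat for each" and pull items out of each line, with no array involved. The function name sumSecondColumn and the column layout are hypothetical, and the sketch assumes simple CSV with no quoted fields containing commas or line breaks.

```livecode
function sumSecondColumn pData
   local tLine, tTotal
   put 0 into tTotal
   set the itemDelimiter to comma
   repeat for each line tLine in pData
      -- Skip blank lines and non-numeric values such as a header row
      if item 2 of tLine is a number then
         add item 2 of tLine to tTotal
      end if
   end repeat
   return tTotal
end sumSecondColumn
```

Called with the three lines "name,amount", "widget,10" and "gadget,32", this should return 42. If you need to look records up by key in no particular order, that is where splitting into an array is more likely to pay for itself.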


