Working with csv files that are 5000 lines or more

Terry Judd tsj at unimelb.edu.au
Wed Apr 9 21:06:53 EDT 2008


Hi Jim,

Take a look at the 'repeat for each line' repeat structure. It's much faster than the 'repeat with i = 1 to the number of lines in....' method you're probably already using.

So for example, if you wanted to iterate through every line and every item in each line you would do something like...

answer file "Select a data file to parse:"
put url ("file:"&it) into tData
put 0 into tRecordNum
repeat for each line tRecord in tData
  add 1 to tRecordNum
  -- this keeps a record of what line you're up to (if you need it)
  -- do something to your line data here if required
  put 0 into tItemNum
  repeat for each item tItem in tRecord
    add 1 to tItemNum
    -- this keeps a record of what item you're up to (if you need it)
    -- do something to your item data here if required
  end repeat
end repeat

You'll find this method is VERY fast and shouldn't suffer from any noticeable slowdown. Just don't try to manipulate tData within the repeat loops. If you need to modify it, either create a copy of it and work on that or 'build' a new one on the fly as you iterate through the lines and items within the repeat loops.
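
As a rough sketch of the 'build a new one on the fly' approach: the variable tNewData and the filter condition below are made up for illustration, so substitute whatever test and changes you actually need.

```livecode
put empty into tNewData
repeat for each line tRecord in tData
  -- hypothetical filter: keep only records whose first item isn't empty
  if item 1 of tRecord is not empty then
    -- tRecord is a copy of the line, so modify it here if you need to
    put tRecord & CR after tNewData
  end if
end repeat
delete last char of tNewData
-- removes the trailing CR left by the loop
```

tData is left untouched throughout, and appending with 'put ... after' keeps the loop fast.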

However, if all you want to do is find one or more instances of a particular item, you are probably better off going with a routine based on the lineOffset function. Finding a partial match is easy...

put lineOffset("xyz",tData) into tLineNum
if tLineNum > 0 then
  -- you've got a match in line tLineNum
  -- iterate through the items to find the position of the one you're after
  -- or use the itemOffset function to track it down
end if
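
For the itemOffset step, something along these lines should do it. This is only a sketch; "xyz" is a placeholder search string and tItemNum is a name picked for the example.

```livecode
put lineOffset("xyz",tData) into tLineNum
if tLineNum > 0 then
  -- itemOffset returns the number of the first item that contains the string
  put itemOffset("xyz",line tLineNum of tData) into tItemNum
  -- item tItemNum of line tLineNum of tData is the (partial) match
end if
```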

If you need to find a complete match (i.e. the whole item) then first add a comma to the beginning and end of each line in tData...

put CR before tData
put CR after tData
-- so you start and end with a blank line
replace CR with CR&"," in tData
-- puts a comma at the start of each line
replace CR&"," with ","&CR&"," in tData
-- puts a comma at the end of each line
delete line 1 of tData
delete last line of tData
-- get rid of the two extra lines added above (each now holds just a comma)
put lineOffset(",xyz,",tData) into tLineNum

...and then take it from there
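
If you need every matching line rather than just the first, one way is to use lineOffset's optional third parameter (the number of lines to skip). A minimal sketch, assuming tData has already had the commas added as above; tSkip, tFound and tMatches are names invented for the example.

```livecode
put 0 into tSkip
put empty into tMatches
repeat forever
  -- the result is counted from the first line after the skipped ones
  put lineOffset(",xyz,",tData,tSkip) into tFound
  if tFound = 0 then exit repeat
  add tFound to tSkip
  -- tSkip now holds the absolute line number of this match
  put tSkip & CR after tMatches
end repeat
```

tMatches ends up with one matching line number per line, and stays empty if nothing was found.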

HTH,

Terry...

-----Original Message-----
From: use-revolution-bounces at lists.runrev.com on behalf of Jim Schaubeck
Sent: Thu 4/10/2008 10:35 AM
To: use-revolution at lists.runrev.com
Subject: Working with csv files that are 5000 lines or more
 
My next project is reading in a csv file as large as 7000 lines with 60 items per line.  The "read file..." and "put it into tempvar" command work very quickly.  When I search the data in tempvar, the repeat command works quickly for the first 1000 lines or so then things slow down dramatically.  I tested it with a scrollbar being updated for every line and 1000 seems to be the break point.  Using "Find..." in excel works very quickly but the same file in my stack is slow.  I'll be loading my app onto 20 or so other users so a database may not be an option unless I can load it into the stack (never tried to include an actual database in my apps).

Any ideas on how to search through csv files that are 2000 to 7000 lines?

Thanks in advance
Jim...