the large file challenge

Sadhunathan Nadesan sadhu at castandcrew.com
Fri Nov 8 21:19:00 EST 2002


| Try something like this:
| 
| > on mouseup
| > put "1" into startread
| > open file thefile for read
| > read from file thefile until eof
| > put the num of lines of it in endtoread
| > close file thefile
| > repeat while startread < endtoread
| > open file thefile for read
| > read from file thefile at startread for 99 lines
| > ...
| > do what you need with it
| > ...
| > close file thefile
| > add 100 to startread
| > end repeat
| > end mouseup


Alors, Pierre,

Many thanks.  This turned out to be more efficient than
I expected.  I had to modify it slightly because the
'read from file ... at' command takes its offset in
characters, not lines (code below).  Anyway, here are the
times for those 3 sample programs on the last run, which
used not my full access log but a 50,000-line snippet
chopped from it.

	Bash shell script (interpreted)    24 seconds
	Pascal (compiled)                   7 seconds
	Metacard (interpreted)              2 minutes 50 seconds

So, any takers on the speed challenge?  Here's the code
I used.

#!/usr/local/bin/mc
on startup
  put "/gig/tmp/log/xaa" into the_file
  put 1 into start_read    -- line number where the next chunk starts
  put 0 into the_counter   -- matching lines found so far
  put 1 into the_offset    -- character offset where the next chunk starts
  -- first pass: read the whole file just to count its lines
  open file the_file for read
  read from file the_file until eof
  put the num of lines of it into end_read
  close file the_file
  -- second pass: scan the file 99 lines at a time
  repeat while (start_read <= end_read)
    open file the_file for read
    read from file the_file at the_offset for 99 lines
    put it into the_text
    -- 'at' takes a character offset, so advance by the chars just read
    put (the number of chars of it) + the_offset into the_offset
    repeat for each line this_line in the_text
      if (not eof) then
        if (this_line contains "mystic_mouse") then
          put the_counter + 1 into the_counter
        end if
      end if
    end repeat
    close file the_file
    -- advance by the 99 lines just read
    add 99 to start_read
  end repeat
  put the_counter
end startup
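
For anyone taking up the challenge, one direction that might
help is to open the file only once and let each read pick up
where the previous one stopped, which would drop the line-counting
pass and the character-offset bookkeeping.  What follows is only
a minimal, untimed sketch.  It assumes that a 'read from file'
without 'at' continues from the end of the previous read, and
that 'the result' is non-empty once the end of the file (or an
error) is reached.  The names chunk_size and read_status, and the
chunk size of 1000 lines, are made up for illustration.

#!/usr/local/bin/mc
on startup
  put "/gig/tmp/log/xaa" into the_file
  put 0 into the_counter
  put 1000 into chunk_size   -- hypothetical value, tune as needed
  open file the_file for read
  repeat forever
    -- no 'at' clause: assumed to continue after the previous read
    read from file the_file for chunk_size lines
    put it into the_text
    -- assumed: non-empty at end of file or on a read error
    put the result into read_status
    repeat for each line this_line in the_text
      if this_line contains "mystic_mouse" then add 1 to the_counter
    end repeat
    -- the last partial chunk has been scanned above, so stop now
    if read_status is not empty then exit repeat
  end repeat
  close file the_file
  put the_counter
end startup

The final chunk is scanned before the loop exits, so the tail of
the file should still be counted even when it is shorter than
chunk_size lines.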


Now, I feel sure we could improve this, fix my errors, etc.  Anyone?

Sadhu


