the large file challenge

andu undo at cloud9.net
Sun Nov 10 18:27:01 EST 2002


--On Sunday, November 10, 2002 13:21:04 -0800 Sadhunathan Nadesan 
<sadhu at castandcrew.com> wrote:


Here's another try, for whatever it's worth. I tested it on a 7000-line file 
of about 800k and it takes less than a second:

on startup
  put 0 into tCount
  put "mystic_mouse" into tWord
  put empty into line 3000 of tChunk
  put "/gig/tmp/log/access_log" into tFile
  open file tFile for read
  put 0 into fOffset
  repeat
    # read the next block of lines; can play with that number for best results
    read from file tFile at fOffset+1 for 3000 lines
    put it into tChunk
    put 0 into tSkip
    repeat
      # offset() returns a position relative to the tSkip characters skipped
      get offset(tWord,tChunk,tSkip)
      if it is not 0 then
        add 1 to tCount
        # resume right after this match so an adjacent one is not missed
        add it+length(tWord)-1 to tSkip
      else
        put 0 into tSkip
        exit repeat
      end if
    end repeat
    add length(tChunk) to fOffset
    if the num of lines of tChunk<3000 then exit repeat
  end repeat
  close file tFile
  put tCount
end startup
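
In case it helps to check the counting loop on its own, here is a minimal 
sketch of it pulled out into a function. The name countOccurrences and its 
parameters are made up for illustration and are not part of the script above. 
It assumes that offset() with a skip count returns a position relative to the 
skipped characters, so the skip count is moved to the last character of each 
match before searching again.

```livecode
# Hypothetical helper, for illustration only: counts the
# non-overlapping occurrences of pWord in pText.
function countOccurrences pWord, pText
  put 0 into tCount
  put 0 into tSkip
  repeat
    # the result is relative to the tSkip characters already skipped
    get offset(pWord, pText, tSkip)
    if it is 0 then exit repeat
    add 1 to tCount
    # advance tSkip to the last character of this match
    add it + length(pWord) - 1 to tSkip
  end repeat
  return tCount
end countOccurrences
```

For example, countOccurrences("ab","abab") should return 2, since the second 
"ab" starts directly after the first one ends.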

Regards, Andu Novac


