Read and Analyze Giant Files
    Sannyasin Sivakatirswami 
    katir at hindu.org
       
    Thu Nov  7 23:49:00 EST 2002
    
    
  
We are trying to use Rev (or MC) to analyze a web site access log that 
is 3 million lines long, a file of 300 MB or more.
If I try a shell script (interpreted) or a Pascal program (compiled), each 
runs in about 2 minutes on this file, but an xTalk script takes a very 
long time; maybe it hangs forever? Shell and Pascal can read the file 
one line at a time and process each line, but I am not sure how to do 
that in MC.
Here's the code:
Shell, 2 lines:
#!/bin/sh
fgrep mystic_mouse | wc -l
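The fgrep pipeline above streams its input instead of loading the whole log first. Purely as an illustration, the same one-line-at-a-time idea can be written out explicitly in plain sh. This is a minimal sketch: the function name count_matches is made up for the example, and a read loop like this would be expected to run far slower than fgrep on a multi-million-line log.

```shell
#!/bin/sh
# count_matches PATTERN
# Hypothetical helper (not from the original post): reads standard input
# one line at a time and prints how many lines contain PATTERN literally.
count_matches() {
  pattern=$1
  count=0
  # "|| [ -n "$line" ]" also picks up a final line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *"$pattern"*) count=$((count + 1)) ;;  # quoted, so matched as plain text
    esac
  done
  echo "$count"
}

# Example: printf 'a mystic_mouse b\nno match\n' | count_matches mystic_mouse
```

Only the current line and the running count are held in memory, so the size of the log does not matter to the loop itself.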
Pascal, 16 lines:
Program detect;
{$H+}
Var
  buffer    : string;
  result    : integer;
Begin {main}
  buffer := '';
  result := 0;
  While (Not(eof)) Do
  Begin
    Readln(buffer);
    If (pos('mystic_mouse', buffer) > 0) Then
      inc(result);
  End; {file}
  Writeln(result);
End. {program}
MetaCard, 13 lines:
#!/usr/local/bin/mc
on startup
  put empty into the_message
  put 0 into the_counter
  read from stdin until empty
  put it into the_message
  repeat for each line this_line in the_message
    if (this_line contains "mystic_mouse") then
      put the_counter + 1 into the_counter
    end if
  end repeat
  put the_counter
end startup
Is there a more efficient way in Transcript to do this?
To try it out, I chopped the log down to half a million lines and got 
these times: shell script, 22 seconds; Pascal, 7 seconds; MetaCard, 
13 minutes and still running.
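One way such a cut-down sample could be produced and timed is sketched below. The function name make_sample and the file names access_log, sample_log and detect.sh are assumptions made for the example, not the names actually used.

```shell
#!/bin/sh
# make_sample SRC N DEST
# Hypothetical helper: copy the first N lines of SRC into DEST.
make_sample() {
  head -n "$2" "$1" > "$3"
}

# Example with assumed file names:
#   make_sample access_log 500000 sample_log
#   time ./detect.sh < sample_log
```

Running each of the three programs against the same sample file keeps the timings comparable.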
I would like never to have to say that "MetaCard/Revolution can't do 
this."
Before I tackle it further, I was wondering whether someone has already 
invented this wheel.
Thanks!
Himalayan Academy Publications
Sannyasin Sivakatirswami
Editor's Assistant/Production Manager
katir at hindu.org
www.HinduismToday.com, www.HimalayanAcademy.com,
www.Gurudeva.org, www.hindu.org