the large file challenge

Pierre Sahores psahores at easynet.fr
Sat Nov 9 08:22:02 EST 2002


Sadhunathan Nadesan wrote:
> 
> | Try something like this:
> |
> | > on mouseup
> | > put "1" into startread
> | > open file thefile for read
> | > read from file thefile until eof
> | > put the num of lines of it in endtoread
> | > close file thefile
> | > repeat while startread < endtoread
> | > open file thefile for read
> | > read from file thefile at startread for 99 lines
> | > ...
> | > do what you need with it
> | > ...
> | > close file thefile
> | > add 100 to startread
> | > end repeat
> | > end mouseup
> 
> So, Pierre,
> 
> Many thanks.  This turned out to be more efficient than
> I thought.  I had to modify it slightly because the
> 'read from file at' command takes an offset in characters,
> not lines.  (Code below).  Anyway, on those 3 sample
> programs, here are the times on the last run, not my full
> access log, but a chopped (50,000 lines) snippet.
> 
>         Bash shell script (interpreted) 24 seconds
>         Pascal (compiled) 7 seconds
>         Metacard (interpreted) 2 minutes 50 seconds
> 
> So, any takers on the speed challenge?  Here's the code
> I used.
> 
> #!/usr/local/bin/mc
> on startup
>   put "/gig/tmp/log/xaa" into the_file
>   put 1 into start_read
>   put 0 into the_counter
>   put 1 into the_offset
>   open file the_file for read
>   read from file the_file until eof
>   put the num of lines of it into end_read
>   close file the_file
>   repeat while (start_read < end_read)
>     open file the_file for read
>     read from file the_file at the_offset for 100 lines
>     put it into the_text
>     put the number of chars of it + the_offset into the_offset
>
>     repeat until lineoffset("mystic_mouse",the_text) = 0
>       if lineoffset("mystic_mouse",the_text) is not "0" then
>         put the_counter + 1 into the_counter
>         delete line 1 to lineoffset("mystic_mouse",the_text) of the_text
>       end if
>     end repeat
>
>     # repeat for each line this_line in the_text
>     #   if (not eof) then
>     #     if (this_line contains "mystic_mouse") then
>     #       put the_counter + 1 into the_counter
>     #     end if
>     #   end if
>     # end repeat
>
>     close file the_file
>     add 100 to start_read
>   end repeat
>   put the_counter
> end startup
> 
> Now, I feel sure we could improve this, fix my errors, etc. Anyone?
> 
> Sadhu

Hello Sadhu,

Perhaps there is a way to speed up your script by using the lineoffset
function, as in the proposal above ;)
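
A minimal sketch of that idea, not a tested or timed version. It keeps
the file open and reads it sequentially, so no character offset and no
first pass to count the lines are needed, and it uses lineoffset with a
"lines to skip" argument instead of deleting lines from the chunk.
Assumptions: the result returns "eof" once a read hits the end of the
file, and this MetaCard version accepts the third (skip) parameter of
lineoffset. The names read_status, skip_lines and found_at are only
illustrative.

```livecode
#!/usr/local/bin/mc
on startup
  put "/gig/tmp/log/xaa" into the_file
  put 0 into the_counter
  open file the_file for read
  repeat forever
    # sequential read: continues where the previous read stopped
    read from file the_file for 100 lines
    put the result into read_status
    put it into the_text
    # walk through the chunk with lineoffset; the value it returns is
    # relative to the lines already skipped (assumed behaviour)
    put 0 into skip_lines
    repeat forever
      put lineoffset("mystic_mouse",the_text,skip_lines) into found_at
      if found_at = 0 then exit repeat
      add 1 to the_counter
      add found_at to skip_lines
    end repeat
    # assumption: the result is "eof" when the end of the file is reached
    if read_status is "eof" then exit repeat
  end repeat
  close file the_file
  put the_counter
end startup
```

Like the original script, this counts lines that contain the string,
not the number of occurrences.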
-- 
Regards, Pierre Sahores

Inspection académique de Seine-Saint-Denis.
Web and VPN applications and databases
Qualifying and producing competitive advantage


