Splitting up a large text file with 'seek relative'
Mark Smith
mark at maseurope.net
Fri Jun 20 11:18:25 EDT 2008
Hugh, I just ran your handler on a 1 GB file of random binary data and
didn't see any slowdown. I added a bit of benchmarking code:
.....
open file tFilePath for binary read
set the numberformat to "####" --| So file names have leading zeroes
put the millisecs into markerTime
repeat
  set the cursor to busy
  add 1 to n
  read from file tFilePath for 1000000
  put the result="eof" into isEOF
  if (it="") then exit repeat
  if isBinary then put it into URL("binfile:"& tDir&"/" &n& ".txt")
  else put it into URL("file:"& tDir&"/" &n& ".txt")
  if n mod 100 = 0 or isEOF then
    put (the millisecs - markerTime) / 100 & " : " after timeList
    put the millisecs into markerTime
  end if
  if (isEOF or the result <>"") then exit repeat
end repeat
close file tFilePath
put timeList
The output was:
0096 : 0094 : 0103 : 0103 : 0102 : 0104 : 0106 : 0101 : 0107 : 0103 : 0048 :
As you can see, there is no significant slowdown. Is the hard disk
you're writing to very full? Maybe it gets harder to find free space as
the loop goes on.
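
A note on reading those figures: each value is the average number of milliseconds per 1,000,000-byte chunk over a block of 100 chunks, so roughly 100 ms per chunk works out to about 10 MB per second for the read and write combined. The last block is divided by 100 even when it holds fewer than 100 chunks, so the final 0048 most likely reflects a short block rather than a speed-up. If it helps to see whether the reads or the writes are the part that slows down, something along these lines might do it. This is only a sketch: the handler name benchmarkSplit and the variables tCount, tReadMs, tWriteMs, tStart and tWriteResult are made up for illustration, and it always uses binfile.

```livecode
-- Sketch only: times reads and writes separately and divides by the
-- number of chunks actually counted in each block.
-- benchmarkSplit, tCount, tReadMs, tWriteMs, tStart and tWriteResult
-- are hypothetical names, not part of the handler quoted in this thread.
on benchmarkSplit tFilePath, tDir
  local n, tCount, tReadMs, tWriteMs, tStart
  local isEOF, tWriteResult, timeList
  open file tFilePath for binary read
  set the numberFormat to "####" -- same format as the original handler
  put 0 into n
  put 0 into tCount
  put 0 into tReadMs
  put 0 into tWriteMs
  repeat
    -- time the read on its own
    put the millisecs into tStart
    read from file tFilePath for 1000000
    put (the result = "eof") into isEOF
    add (the millisecs - tStart) to tReadMs
    if it is empty then exit repeat
    add 1 to n
    add 1 to tCount
    -- time the write on its own
    put the millisecs into tStart
    put it into URL ("binfile:" & tDir & "/" & n & ".txt")
    put the result into tWriteResult -- capture before anything resets it
    add (the millisecs - tStart) to tWriteMs
    if tCount = 100 or isEOF then
      -- average per chunk over the chunks actually in this block
      put tReadMs / tCount & "r+" & tWriteMs / tCount & "w : " after timeList
      put 0 into tCount
      put 0 into tReadMs
      put 0 into tWriteMs
    end if
    if isEOF or tWriteResult is not empty then exit repeat
  end repeat
  -- flush a short final block if the loop ended on an empty read
  if tCount > 0 then
    put tReadMs / tCount & "r+" & tWriteMs / tCount & "w : " after timeList
  end if
  close file tFilePath
  put timeList
end benchmarkSplit
```

If the "w" figures climb while the "r" figures stay flat, that would point at the destination disk rather than at 'read from file'.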
Best,
Mark
On 20 Jun 2008, at 08:31, Hugh Senior wrote:
> You are right, but logging still shows a cumulative slowdown as
> each chunk is 'read', and the computer slows to a crawl. Using
> 'read from ... for ...' is even slower, however. (The source file
> is a 1 GIG binary text file)
>
> Given tFilepath, write out 1Mb files sequentially numbered...
> put the hilite of btn "Binary" into isBinary
> if isBinary then open file tFilePath for binary read
> else open file tFilePath for text read
> set the numberFormat to "####" --| So file names have leading zeroes
> seek to 0 in file tFilePath
> repeat
>   set the cursor to busy
>   add 1 to n
>   --| seek relative 0 in file tFilePath --| Redundant
>   read from file tFilePath for 1000000
>   put the result="eof" into isEOF
>   if (it="") then exit repeat
>   if isBinary then put it into URL("binfile:"& tDir&"/" &n& ".txt")
>   else put it into URL("file:"& tDir&"/" &n& ".txt")
>   if (isEOF OR the result <>"") then exit repeat
> end repeat
> close file tFilePath
>
> Any further insights would be truly welcomed.
>
> /H
>
> ----------------------------------------------
> Hugh, it strikes me that the "seek relative 0" might be redundant -
> and may be slowing things down.
>
> Best,
>
> Mark
> _______________________________________________
> use-revolution mailing list
> use-revolution at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-revolution