Challenge...

Alex Tweedly alex at tweedly.net
Tue Jun 20 17:08:55 EDT 2006



Jim Ault wrote:

> There are other strategies for reading a file into memory using
>
>open file fn
>put 1 into x
>put 50000000 into y
>repeat forever 
>read from file fn from x for y characters
>if it is empty then exit repeat --no more chars to process
>>         filter it with "*/Separation*"
>>         replace "#20" with space in it
>put cr & it after vColors
>read from file from x for y characters
>>         filter it with "*/DeviceN*"
>>         replace "#20" with space in it
>>         replace "]" with "" in it
>put cr & it after vColors
>add y to x
>end repeat
>close file fn
>filter vColors without empty
>
In my role as curmudgeonly code debugger, I should point out that this 
can fail when the interesting string spans a "block" boundary, because a 
line split across two reads is only ever tested in pieces - though since 
the blocks are 50 MB in size, this is perhaps unlikely. You can avoid 
that risk by doing something like

put empty into lRemainder
repeat
     ...   
     put lRemainder into lBuffer
     read from file fn at x for y characters
     put it after lBuffer
     put the last line of lBuffer into lRemainder
     delete the last line of lBuffer  -- so the partial line is not tested twice
     ....   (using lBuffer to do the test)
end repeat

This has the disadvantage of one extra copy of the data; the technique 
of putting lRemainder together with the first line of each block, and 
then testing that combined line in addition to the rest of the block is 
left as an exercise for the reader.
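
For anyone who would rather not do the exercise, here is one possible 
reading of it, as an untested sketch rather than a definitive answer. It 
walks each block line by line instead of using "filter", glues the 
carried-over fragment onto the first line of the next block, and holds 
back the last line whenever the block does not end in a return. The 
handler names colorLines and isColorLine and all the t/p variables are 
made up for illustration; it also assumes the file is opened in text mode 
and that lines are separated by return as the engine sees them.

```livecode
-- Hypothetical helper: true if a line mentions one of the colour spaces.
function isColorLine pLine
   return (pLine contains "/Separation") or (pLine contains "/DeviceN")
end isColorLine

-- Sketch only: returns the matching lines of the file at pPath,
-- reading pBlockSize characters at a time.
function colorLines pPath, pBlockSize
   local tRemainder, tFound, tCheck, tLine, tCount, tEndsClean, i
   open file pPath for read
   if the result is not empty then return empty  -- could not open the file
   repeat forever
      read from file pPath for pBlockSize  -- continues where the last read stopped
      if it is empty then exit repeat
      put (the last char of it is cr) into tEndsClean
      put the number of lines of it into tCount
      put 0 into i
      repeat for each line tLine in it
         add 1 to i
         if i = 1 then
            -- glue the carried-over fragment onto the first line of this block
            put tRemainder & tLine into tCheck
            put empty into tRemainder
         else
            put tLine into tCheck
         end if
         if i = tCount and not tEndsClean then
            put tCheck into tRemainder  -- line is cut off: finish it next time
         else if isColorLine(tCheck) then
            put tCheck & cr after tFound
         end if
      end repeat
   end repeat
   close file pPath
   -- whatever is left over at end of file is a complete last line
   if isColorLine(tRemainder) then put tRemainder & cr after tFound
   return tFound
end colorLines
```

It would be called as something like

     put colorLines(vPDFpath, 50000000) into vColors

after which the "#20" and "]" replacements from the original script can 
be applied to the handful of lines that come back. Only one block is in 
memory at a time and no second copy of it is made, at the cost of a 
per-line test instead of a single "filter".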




>--check the docs for details
>
>Jim Ault
>Las Vegas
>
>On 6/20/06 12:52 PM, "Ton Kuypers" <tkuypers at dmp-int.com> wrote:
>
>  
>
>>Hi gang... I need some help...
>>
>>A user selects a PDF file, I need to know what colors are in this PDF
>>file.
>>So far no good, I can read the data and filter out the unwanted lines.
>>
>>But this becomes a problem when the PDF file is 50 Mb or bigger...
>>
>>At this point I use:
>>
>>         put "file:" & vPDFpath into vURL
>>         put url vURL into vColors1
>>         put url vURL into vColors2
>>         filter vColors1 with "*/Separation*"
>>         replace "#20" with space in vColors1
>>         filter vColors2 with "*/DeviceN*"
>>         replace "#20" with space in vColors2
>>         replace "]" with "" in vColors2
>>         put vColors1 & vColors2 into vColors
>>
>>This way I get the lines containing the PDF colors, which I filter
>>and use.
>>On normal PDF's this happens on the fly, no delay at all...
>>
>>But one of my clients now sent me a 200 Mb PDF... And you can guess
>>the problem: The file is loaded into memory twice, taking up more
>>then 400 Mb of memory, just to get 3 or 4 lines of data... It's
>>ssssllloooooowwwwwwwwww....
>>
>>Any ideas on how to do this faster?


-- 
Alex Tweedly       http://www.tweedly.net
