Challenge...
Alex Tweedly
alex at tweedly.net
Tue Jun 20 16:43:41 EDT 2006
Ton Kuypers wrote:
> Hi gang... I need some help...
>
> A user selects a PDF file, I need to know what colors are in this PDF
> file.
> So far no good, I can read the data and filter out the unwanted lines.
>
> But this becomes a problem when the PDF file is 50 Mb or bigger...
>
> At this point I use:
>
> put "file:" & vPDFpath into vURL
> put url vURL into vColors1
> put url vURL into vColors2
> filter vColors1 with "*/Separation*"
> replace "#20" with space in vColors1
> filter vColors2 with "*/DeviceN*"
> replace "#20" with space in vColors2
> replace "]" with "" in vColors2
> put vColors1 & vColors2 into vColors
>
> This way I get the lines containing the PDF colors, which I filter
> and use.
> On normal PDF's this happens on the fly, no delay at all...
>
> But one of my clients now sent me a 200 Mb PDF... And you can guess
> the problem: The file is loaded into memory twice, taking up more
> than 400 Mb of memory, just to get 3 or 4 lines of data... It's
> ssssllloooooowwwwwwwwww....
>
> Any ideas on how to do this faster?
Maybe I'm missing something, but would it help to do this instead?
    put "file:" & vPDFpath into vURL
    put url vURL into vColors1
    filter vColors1 with "*/Separation*"
    replace "#20" with space in vColors1
    put url vURL into vColors2
    filter vColors2 with "*/DeviceN*"
    replace "#20" with space in vColors2
    replace "]" with "" in vColors2
    put vColors1 & vColors2 into vColors
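i.e. all I did was move the "put url vURL into vColors2" line down
until after the filter had been done on vColors1. By then vColors1
holds only the few matching lines, so the space needed at any one time
drops to roughly one copy of the file (about 200 MB) instead of two.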
If that isn't enough, then what to try next depends on whether you need
to have a full copy of the data in memory for any other purpose or not.
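
If you don't need the full copy, one option is to scan the file a line
at a time and keep only the matching lines, so the whole file never has
to sit in memory. Below is a rough sketch of that idea, written in
Python rather than Transcript purely for illustration. The function
name find_color_lines and the sample data are made up, and it assumes
the file uses newline-terminated lines and that each color entry sits
on a single line, as the filter approach already assumes.

```python
import os
import tempfile


def find_color_lines(pdf_path):
    """Collect lines mentioning /Separation or /DeviceN, one line at a time."""
    separation = []
    device_n = []
    # Binary mode, since a PDF is not pure text. Iterating the file object
    # yields one newline-terminated line at a time (assumption: the file
    # uses LF or CRLF line endings, not bare CR).
    with open(pdf_path, "rb") as f:
        for raw in f:
            # latin-1 maps every byte to a character, so decoding never fails.
            line = raw.decode("latin-1").rstrip("\r\n")
            if "/Separation" in line:
                separation.append(line.replace("#20", " "))
            if "/DeviceN" in line:
                device_n.append(line.replace("#20", " ").replace("]", ""))
    # Same order as the original script: Separation lines, then DeviceN lines.
    return separation + device_n


# Tiny made-up sample standing in for a real PDF.
sample = (
    b"%PDF-1.4\n"
    b"[/Separation /PANTONE#20185#20C /DeviceCMYK 5 0 R]\n"
    b"stream data that should be skipped\n"
    b"[/DeviceN [/Cyan /Spot#201] /DeviceCMYK 6 0 R]\n"
)
with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
    tmp.write(sample)

result = find_color_lines(tmp.name)
os.remove(tmp.name)

for line in result:
    print(line)
```

Memory use here is bounded by the longest single line plus the handful
of matches, rather than by the size of the file. Whether it is actually
faster than loading and filtering would need to be measured on a real
200 MB file.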
--
Alex Tweedly http://www.tweedly.net