Reading PDF - a cry for help

Graham Samuel livfoss at mac.com
Sat Oct 1 08:22:52 EDT 2011


Thanks Bernard. I will look into (1), although I have no idea of the resource implications. For (2), I see several obstacles: I don't think my publisher would consider a solution that depends permanently on an external server; I know nothing about Linux; the owner of the PDF (usually a UK government agency or licensee) might consider transfer to an external server to be a violation of its terms of use; and since mine is a commercial product, there would presumably be licensing issues for a non-GNU developer. But an interesting thought.

Graham

On 30 Sep 2011, at 19:08:31 +0100, Bernard Devlin <bdrunrev at gmail.com> wrote:
> 
> I have a couple of suggestions (although I am not sure either will
> work as smoothly as Graham wants, they may still be worth a try).
> 
> 1. Display the PDF in a browser control, snapshot the window, and present
> the snapshot to the user to crop to just the image.
> 2. Assuming that there is a Linux solution (I've used pdf2txt or some
> such on Linux to extract the text out of 1000-page PDF files), create
> your own webservice that will accept a PDF file and a page number, and
> it returns an image of the page to be cropped by the user.  I have
> created such a web service before which took files in various "office"
> formats and returned the data from the files (using OpenOffice running
> headless on the Linux server to extract the text).  Whilst such a
> service might seem like a lot of work to set up, it is going to be
> easier than writing an external or (I would imagine) parsing
> PostScript (although I do have the PostScript manuals and
> specification lying around here somewhere in PDF format).  You can get
> your own VPS at Linode for approx $20 a month.
> 
> Bernard
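
For anyone wanting to try the page-to-image step behind Bernard's option (2), here is a rough sketch in Python. It is only a sketch under assumptions that nobody in this thread has stated: it assumes Poppler's pdftoppm command-line tool is installed, and the function names build_render_command and render_page are made up for illustration. It takes a PDF file and a page number and produces a PNG of that page, which could then be handed to the user for cropping.

```python
import subprocess
from pathlib import Path


def build_render_command(pdf_path, page, out_prefix, dpi=150):
    """Return the pdftoppm argument list that renders one page as a PNG.

    Assumption: Poppler's pdftoppm is on the PATH. The thread does not
    name a specific tool for this step.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    return [
        "pdftoppm",
        "-f", str(page),   # first page to render
        "-l", str(page),   # last page to render (same page, so only one)
        "-r", str(dpi),    # resolution in DPI
        "-png",            # write PNG instead of the default PPM
        "-singlefile",     # name the output <out_prefix>.png, no page suffix
        str(pdf_path),
        str(out_prefix),
    ]


def render_page(pdf_path, page, out_prefix, dpi=150):
    """Run pdftoppm and return the path of the PNG it is expected to write."""
    subprocess.run(build_render_command(pdf_path, page, out_prefix, dpi), check=True)
    return Path(f"{out_prefix}.png")
```

A web service, or equally a helper run on the user's own machine, would call something like render_page("report.pdf", 3, "page3") and return the resulting page3.png. Whether this sits comfortably with the PDF owner's terms of use or with commercial licensing is a separate question that the sketch does not address.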



