Reading PDF - a cry for help
Graham Samuel
livfoss at mac.com
Fri Sep 30 11:02:53 EDT 2011
Thanks to all who replied. As the one who started this thread, I'd like to say that I pretty much despair of finding a solution. The current position seems to me admirably summarised by Paul Dupuis (see below). Suggestions that I use a command-line utility seem to me to come down to using ImageMagick, since I have not found any other solution that has the right functionality and licensing terms. But when I looked into it (and I freely admit I know almost nothing about the internal workings of Windows), it seemed to me that IM is a resource hog that would not be amenable to a simple installation process hidden from the user, and that operating it so as to provide a LiveCode window containing the relevant representation would not be straightforward and would certainly mean clunky intermediate files. So it would be very different from an 'import paint' situation. Bear in mind that I am not interested in the text in a PDF, just the image content (just a bitmap really), so things like IM are overkill for me anyway. But this 'modest' requirement hasn't got me any nearer a solution.
What really annoys me is that if I were writing my app in Visual Basic, I suspect there would be library components available with the right licensing terms, but the promise of a simple 'glue' or 'wrapper' capability to tie LiveCode to third-party externals whose APIs were not written with LiveCode in mind has not been fulfilled, even though it has been proposed in some versions of the LC documentation.
Before I completely give up I will go round the ImageMagick route once more, since I suppose I may have misunderstood its resource requirements, and it does have the advantage of being able to read TIFFs, which is another problem I have (also not likely to be on LiveCode's radar despite QA requests).
As a last remark, I'd be interested to know of the details of ANY implementation of adding functionality of any kind to LiveCode via a third-party application and 'shell'. I have never seen this in action and I can't remember it being demonstrated by anyone on this list - but maybe I just wasn't paying attention.
Graham
On 9/29/2011 10:01 AM, Graham Samuel wrote:
Short of RunRev itself extending input formats to include PDF (not impossible, but not likely in the short term), the solution would seem to be to licence a third-party library component and integrate it into my app by the use of bridging ('glue') code. I got pretty near with this one, having identified a component with suitable licensing terms and functionality (Sorax DLL). RunRev suggested that I could do the gluing with the aid of a 'C' programmer. It turns out after a lot of research by Thierry Douez, who has been helping me, that what I need is a person familiar with Visual Studio to accomplish this - but I despair of finding such a person who would also be familiar with the externals interface of the LiveCode engine. Maybe I will find such a person, but the trail does seem to have gone cold.
Has anyone any suggestion as to how I might proceed? My app works so nicely with JPG and PNG files, and I have (a little) belief that I could make it work with TIFF files, but without PDF input I am dead in the water.
As some folks may remember, I have posted to this list a number of times
on the need for being able to open and read PDF content (text and
images) in LiveCode. We at Researchware have, I think, thoroughly
explored this topic. It all boils down to the fact that you need 3rd party
technology that can read the PDF format and render it and/or extract the
text from it.
For pages as images or unstyled text, the cheap and dirty way is to
use a 3rd party command-line utility to make your conversions. From a
script perspective, you perform an answer file command, get the PDF
file, and then use shell to batch convert it and then read the resulting
text file or image file(s) back in. There is NO other free way to do
this. Yes, this is ugly and probably not for the novice scripter and your
code pretty much has to be platform specific, but again, it is the ONLY
free way to do this. You are also not ever displaying a real PDF - you
are either displaying images of pages OR the unformatted, unstyled text.
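The shell-out steps just described (pick a file, run a converter, read the
result back in) might look roughly like the sketch below. It is written in
Python purely to show the shape of the workflow; in LiveCode the shell
function would play the part of subprocess.run. It assumes ImageMagick's
convert command is installed and on the PATH, along with Ghostscript, which
ImageMagick relies on to render PDFs. The function names are hypothetical,
and the tool parameter is there because ImageMagick 7 names its binary
magick.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(pdf_path, png_path, page=0, density=150, tool="convert"):
    """Return the argument list that renders one PDF page to a PNG file."""
    # "file.pdf[0]" selects the first page (pages are zero-based);
    # -density sets the rendering resolution in dots per inch.
    return [tool, "-density", str(density), f"{pdf_path}[{page}]", str(png_path)]


def render_pdf_page(pdf_path, page=0, density=150, tool="convert"):
    """Run the external converter and return the rendered page as PNG bytes."""
    # Assumption: the converter is already installed; fail early if it is not.
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool!r} is not on the PATH")
    # The intermediate file lives in a temporary folder that is removed
    # as soon as the bytes have been read back in.
    with tempfile.TemporaryDirectory() as tmp:
        png_path = Path(tmp) / "page.png"
        cmd = build_convert_command(pdf_path, png_path, page, density, tool)
        subprocess.run(cmd, check=True, capture_output=True)
        return png_path.read_bytes()
```

The intermediate file cannot be avoided with this approach, but keeping it
in a temporary folder at least hides it from the user. The returned bytes
are an ordinary PNG, which is a format the original poster says his app
already handles.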
You can also do a limited form of displaying a PDF in a window (you
can't get or copy any selections/content in it though, and can only
navigate under script control by page) through InterApplication
Communication (IAC).
To open a PDF in LiveCode where you can actually control navigation
through script control and get or set the user selections requires two
things: (a) a PDF library with APIs supporting these actions and (b)
creating a set of LiveCode externals that in turn use the PDF APIs to
provide these functions. The main problem with this approach is that all
(or all we could find) of the open source or free PDF libraries are
woefully immature and lack major functionality. Only commercial PDF
technology has the supported APIs for this, and the commercial PDF
technology providers, whether Adobe, Foxit or others, all typically
charge on a per-unit-shipped royalty model. And some, like Adobe, are
really expensive.
I used the revPartner program to explore this with RunRev quite some
time ago and asked whether they would consider support in the engine. At
the time, they only said it was not practical due to licensing issues
(or close to that). What I understand now is that this is because the open
source PDF libraries are crap and the commercial ones would have
imposed an entirely different licensing model for LiveCode - one with
runtime royalties - which I think none of us want (RunRev or developers).
I am afraid, as of September 2011, that is the state of LiveCode and
PDFs. There is a promising free open source GNU effort out there
(http://gnupdf.org/Library), but most of the libraries are only 30 or
40% complete. When this is complete, we can all benefit from free PDF
support in LiveCode by wrapping the external API around the GNU effort.
Until then, you have to choose between cheap, dirty, and limited OR
costly and commercial.
--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/