Extracting text from PDF
Thomas McGrath III
3mcgrath at comcast.net
Thu Feb 28 10:34:22 EST 2008
Late to this discussion:
I just remembered that if the PDF is located on the internet,
Google will translate the PDF to HTML. Google states that they
automatically generate HTML versions of ALL documents as they crawl
the web:
This is the html version of the file http://www.pewinternet.org/pdfs/PIP%20Bloggers%20Report%20July%2019%202006.pdf
G o o g l e automatically generates html versions of documents as we
crawl the web.
To link to or bookmark this page, use the following url: http://www.google.com/search?q=cache:FRJaOWTjwwgJ:www.pewinternet.org/pdfs/PIP%2520Bloggers%2520Report%2520July%252019%25202006.pdf+pdf&hl=en&ct=clnk&cd=5&gl=us&client=safari
Scripting this in Rev might be hard, but only because there is no set
rule for deriving the address of the converted version from the
address of the original. However, this might be somewhat useful in
dealing with the PDF revBrowser issue, in that a version can be
displayed in HTML for most
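
A minimal sketch of one possible workaround for the address problem, written in Python rather than Rev for brevity. It assumes (nothing in this thread confirms it) that Google's cache: search operator accepts the document's address without the opaque identifier (FRJaOWTjwwgJ) that appears in the bookmark link above. The function name google_cache_url is hypothetical.

```python
from urllib.parse import quote, urlsplit


def google_cache_url(pdf_url: str) -> str:
    """Build a Google search URL that uses the cache: operator for pdf_url.

    Assumption: the cache: operator works with a bare host + path and does
    not need the opaque identifier seen in Google's own bookmark link.
    """
    parts = urlsplit(pdf_url)
    # Drop the scheme; the bookmark link uses host + path only.
    # Any query string on the original address is ignored in this sketch.
    target = parts.netloc + parts.path
    # Percent-encode again so an existing %20 becomes %2520, which matches
    # the double-encoded form in the bookmark link.
    return "http://www.google.com/search?q=cache:" + quote(target, safe="/")


if __name__ == "__main__":
    original = ("http://www.pewinternet.org/pdfs/"
                "PIP%20Bloggers%20Report%20July%2019%202006.pdf")
    print(google_cache_url(original))
```

If the assumption does not hold, the identifier would have to be scraped from a search results page instead, which is the "no set rule" problem again.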
On Feb 27, 2008, at 5:56 AM, Kay C Lan wrote:
> On Sun, Feb 24, 2008 at 7:43 AM, J. Landman Gay
>> Do you have the PDF2RTF service installed? It's required. It's working
>> for me in Tiger.
> Thanks so much Jacque for stepping in; sorry I left some in the lurch, but I
> had to turn up to my real job and then, due to unforeseen circumstances, ended
> up gone much longer than expected :-(
> Yes, in a previous post I gave the URL to Devon Technologies:
> to pick up PDF2RTFService which is good for 10.4.x and later.
> use-revolution mailing list
> use-revolution at lists.runrev.com