How to read a specific page from a PDF document?

Kay C Lan lan.kc.macmail at gmail.com
Mon Nov 14 23:42:53 EST 2011


I know I'm very late to the party but anyone who is doing this on Mac needs
to be aware of one very large WARNING. This will NOT work as expected on
anything other than simple page layouts.

As of Snow Leopard, Apple made PDF rendering 'smarter'. That is, smarter
in the way auto text correction on smart phones is smart. Whilst this
technology works a good percentage of the time, it tends to fail just when
you really need it to work.

Prior to SnoLeo, copying text from a PDF document (be it in Preview, Skim or
Reader) and pasting it into TextEdit (or similar) would result in each
whole line of text appearing on the equivalent line in the text document.
That is good if the document is a single column, generally bad if it isn't.
To compensate, you could hold down the Option key whilst dragging across
the text; Jaguar/Tiger/Leopard then changed to Column mode and tried to
work out where the columns were, so that when you pasted, Column 2 text
appeared under Column 1 text. Neat, and a time saver in most cases.

As of SnoLeo this feature is automatically turned On and as far as I know
it can't be turned Off. I have no idea about Lion.

This is great if your documents are simple layouts of one, two or more
'basic' columns of text. It is a huge pain in the backside if the page looks
like columns of text but isn't. A case in point is any table; it will come
out with each column rendered one below the other, as a single column in
your text document. This is 'fixable' in LC if every cell of the table
contains data AND the same number of lines of data, but if any cells are
blank, then the data is truncated and it becomes a nightmare to correct. If
a table appears at the centre of a page with two columns of text squeezing
around it, and some cells contain multiple lines of text whilst the majority
don't, the result is pretty scrambled and virtually impossible to correct.
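
As an illustration of the 'fixable' case only, here is a minimal LiveCode
sketch of regrouping column-stacked text into rows. The handler name
columnsToRows and its pRowCount parameter are hypothetical, and it assumes
every cell holds exactly one line, no cell is blank, and you already know
how many rows the table has. It does nothing for the scrambled cases
described above.

```livecode
-- Hypothetical helper (not part of LiveCode or any library).
-- pText: pasted text in which the table's columns appear one below the
--        other, one cell per line, with no blank cells (assumption).
-- pRowCount: number of rows in the original table (assumption: known).
-- Returns the table as tab-delimited rows.
function columnsToRows pText, pRowCount
   local tRows, tIndex, tRowNum
   put 0 into tIndex
   repeat for each line tCell in pText
      -- the first pRowCount lines are column 1, the next pRowCount
      -- lines are column 2, and so on
      put (tIndex mod pRowCount) + 1 into tRowNum
      -- every column after the first is separated from the previous one by a tab
      if tIndex >= pRowCount then put tab after line tRowNum of tRows
      put tCell after line tRowNum of tRows
      add 1 to tIndex
   end repeat
   return tRows
end columnsToRows
```

For example, put columnsToRows(field "Pasted", 3) into field "Table" would
turn the six lines Name, Ann, Bob, Age, 31, 27 into three rows of two
tab-separated cells each (the field names here are placeholders).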

This 'smart feature' is built into the Quartz graphics engine and therefore
affects every application that uses it, including Skim, Acrobat Reader and
most unfortunately Devon Technologies PDF2RTFService, which I've used for
years to get pdf as text into Rev/LC (highly recommended, it's free from
here:  http://www.devontechnologies.com/download/index.html). AppleScript's
'get text' uses the same engine and is equivalent to Copy.

One workaround: in Acrobat Reader you can use Save as.... and select
Text. Reader knows exactly what is a column and what is a table, and will
render a good text version*. But Reader is not AppleScriptable so you can't
automate this with LC. Some AS gurus here might know how to use AppleEvents
to force Reader to submit to LC's will, but that's beyond me.

The only automated solution I've found, which is only viable for my own
personal use, relies on a G5 that my wife uses and that runs Leopard. I have
a drop box there that I send pdfs to, which AppleScript and PDF2RTFService
then convert to 'unscrambled' text files.

Before you commit to the previously provided solution you'd better be sure
it works with ALL the likely layouts of pdf files that are going to be
thrown at it, otherwise a multitude of support calls await.

I'm meaning to ask the Forums at Apple if PDF has been fixed in Lion, but
the 5 min I've just had spare I've spent here :-)

* The Leopard 'get text' version actually renders a better text version
than Reader's 'Save as.... Text'. With Reader, any gap between data is
rendered as a single space character; Leopard attempts to substitute a
variable number of space characters depending on the size of the gap.
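
Where the extracted text does keep multi-space gaps, as described for
Leopard above, a rough LiveCode sketch of turning those gaps into tab
delimiters could look like the following. The handler name gapsToTabs is
hypothetical, and it assumes a run of two or more spaces always marks a
column gap while single spaces only occur inside a cell; real documents
may well break that assumption.

```livecode
-- Hypothetical helper (not part of LiveCode or any library).
-- Replaces every run of two or more spaces with a single tab,
-- leaving single spaces (assumed to be ordinary word gaps) alone.
function gapsToTabs pText
   return replaceText(pText, " {2,}", tab)
end gapsToTabs
```

It will not help with Reader's 'Save as.... Text' output, where every gap
has already been collapsed to a single space.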

On Thu, Oct 20, 2011 at 6:03 PM, Francis Nugent Dixon <effendi at wanadoo.fr> wrote:

> Hi from Beautiful Brittany,
>
> Thanks to Ken Ray for his "close to a one-liner".
>
>
>  put char 1 to 3 of field "MyPagesC" into GVPage
>> -- build the AppleScript as one string; LVDeskTop is an AppleScript
>> -- variable, so it is kept inside the quoted text
>> put "tell app " & quote & "Skim" & quote & cr & \
>>  "set LVDeskTop to path to desktop as string" & cr & \
>>  "open (LVDeskTop & " & quote & "SkimTest1.pdf" & quote & ")" & cr & \
>>  "tell document 1" & cr & "go to page " & GVPage & cr & \
>>  "set result to (get text for page " & GVPage & ") as text" & cr & \
>>  "end tell" & cr & "end tell" into GVMasterScript
>>
>> do GVMasterScript as AppleScript
>> -- drop the first and last characters (the quotes around the returned text)
>> put char 2 to -2 of the result into field "MySkim"
>> show field "MySkim"
>
> I have added Ken's small correction to his post of yesterday
> (he gave me the correction off-forum)
>
> Great Stuff - Works a treat !
>
> Sigh ! - I have a long way to go to master AppleScript ...
>
> -Francis
>
>
> "Nothing should ever be done for the first time !"