Parsing a PDF file

-hh hh at hh.on-rev.com
Mon Jul 11 11:04:17 EDT 2016


> Roger E.  wrote:
> > Since this seems to be Mac only, why not "do as Applescript" then select
> > all, and Copy?
Kay C. L. wrote:
> Because Preview isn't properly scriptable and you can't "Select All"
> or "Copy". As Richard said, the answer is with Automator. 

Automator is a GUI for "bundled" AppleScript routines. An alternative
here may be to use AppleScript directly, because that is easier to
"adjust if needed":

Here are an LC script and an AppleScript that together do the PDF2TXT job.
It's pretty slow, but it delivers tables a little better "formatted"
than pdfToText or Acrobat Reader's "Save as text" do.

I prefer to separate the steps and watch the process (activated apps).
[a] Download all files to a folder.
[b] Convert all pdf files of that folder to text into that folder.
[c] Work on the converted files.
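
Step [c] depends on what you need from the text. As a minimal sketch only
(not part of the scripts in this post), and assuming the .txt files end up
next to the PDFs in the same folder, a function like the following collects
them into one variable. The name "collectConvertedText" and its parameter
are made up for illustration.

```livecode
-- Sketch: return the text of all *.txt files in pFolder,
-- each one preceded by a header line with its file name.
function collectConvertedText pFolder
  local tOldFolder, tFiles, tFile, tAll
  put the defaultFolder into tOldFolder
  set the defaultFolder to pFolder
  put the files into tFiles
  set the defaultFolder to tOldFolder -- restore the previous folder
  filter tFiles with "*.txt"
  repeat for each line tFile in tFiles
    put "=== " & tFile & " ===" & cr after tAll
    put url ("file:" & pFolder & "/" & tFile) & cr after tAll
  end repeat
  return tAll
end collectConvertedText
```

It could be called as collectConvertedText("/Users/admin/Downloads/precincts");
what to do with the returned text is up to you.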

The following works here, running MacOS 10.11.5, with LC 6/7/8.
Probably you need at least MacOS 10.6.

For step [b]:

[1] Allow "Accessibility" as described in the comments at the top of
    the script, and put the following into a field "AScript"
-- begin field
-- Needs assistive access enabled:
--  Before MacOS 10.11:
--   System preferences/Accessibility --> Enable access for assistive devices
--  MacOS 10.11 and later:
--   System preferences/Security&Privacy/Accessibility --> add Livecode
tell application "Preview"
  activate -- when activated you see menu "Edit" highlighting on/off
  set myPath to "xxxx" -- "xxxx" is a placeholder for the full path of the PDF
  open myPath
  tell application "System Events"
    tell process "Preview"
      tell menu bar 1
        click menu item "Select All" of menu "Edit"
        click menu item "Copy" of menu "Edit"
      end tell
    end tell
  end tell
  close document 1
end tell
-- give Preview some time, else the script may appear "unstable"
delay 7 -- (seconds) adjust to the speed of your machine
tell application "Livecode" to activate
--end field

[2] Make a button "Convert PDFs" with the following script
--begin script
-- the path to the folder where all your PDFs reside
local PDFfolder="/Users/admin/Downloads/precincts"

on mouseUp
  set defaultfolder to PDFfolder
  put the files into ff
  filter ff with "*.pdf"
  put field "AScript" into aScript
  repeat for each line f in ff
    put aScript into fScript
    put PDFfolder & "/" & f into f0
    replace "//" with "/" in f0
    replace "xxxx" with f0 in fScript -- fill in the path placeholder
    do fScript as applescript
    go this stack
    set itemdelimiter to "."
    put "txt" into last item of f0
    set itemdelimiter to comma
    put the clipboardData["text"] into url ("file:" & f0)
    -- put f0 & cr before fld "jobsDone" -- for testing
  end repeat
end mouseUp
--end script
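
The clipboard is the only channel between Preview and LC here, so if
Preview is too slow, text left over from the previous file may be written
to the wrong .txt file. One way to notice this, given as a sketch and not
as part of the script above, is to empty the clipboard before each run and
to check it afterwards. The function name "convertOnePDF" is hypothetical;
it assumes the field "AScript" from step [1] with its "xxxx" placeholder.

```livecode
-- Sketch: convert one PDF; returns empty on success, else a short message.
function convertOnePDF pPDFPath
  local tScript, tText, tTextPath
  put field "AScript" into tScript
  replace "xxxx" with pPDFPath in tScript
  set the clipboardData["text"] to empty -- so stale text is not reused
  do tScript as AppleScript
  put the clipboardData["text"] into tText
  if tText is empty then return "nothing copied from" && pPDFPath
  -- same file name as the PDF, but with the extension "txt"
  put pPDFPath into tTextPath
  set the itemDelimiter to "."
  put "txt" into the last item of tTextPath
  put tText into url ("file:" & tTextPath)
  return empty
end convertOnePDF
```

Inside the repeat loop one could then call convertOnePDF(f0) and log any
non-empty return value, for example in the field "jobsDone" used for
testing above.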







