Parsing a PDF file

Richard Gaskin ambassador at fourthworld.com
Sat Jul 9 11:54:26 EDT 2016


Jim Hurley wrote:

 > Thanks Richard.
 >
 > You are so right about releasing data in complex formats.
 > I spoke to the elections office about posting election results in PDF
 > format.
 > I knew there was no use fighting them when they told me that it was
 > now County "policy" to post everything in PDF--not unlike those 10
 > policies of renown that were carved in stone--and a metaphor was born.

Unfortunate, as it renders the data nearly useless.  I agree you need to 
pick your battles, but it's dismaying in an ostensible democracy when 
the process of open data for civic-minded citizens is implemented in 
ways that ultimately deliver the opposite of the intended goal.

Across the US we're beginning to see a revolution in government data 
sharing.  At the municipal level one of the shining examples has been 
Raleigh, NC, in no small part due to the work of Jason Hibbets.  He 
works as the Community Manager for Red Hat, and has devoted significant 
volunteer time working with city officials to make data available so 
local devs can deliver apps for the community.

Notes on his work and a link to his excellent book, "The Foundation for 
an Open Source City" (I got a signed copy when I met him at the SoCal 
Linux Expo a couple years ago) are here:
http://theopensourcecity.com/

The slides from the SCaLE talk where I met him are linked to from this 
page outlining his presentation:
http://www.socallinuxexpo.org/scale12x/presentations/open-source-all-cities.html


 > In the County's old system, each of the 50 election precincts were
 > stored in 50 web pages as HTML documents.
 > That was perfect for LiveCode's "get url". It was a matter of seconds
 > to visit all 50 pages, parse the text, and store the data.

So much for progress. ;)
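
For anyone following along outside LiveCode, the fetch-parse-store loop 
described above can be sketched roughly like this in Python.  This is 
only an illustration under assumptions: the URL pattern is made up (the 
County's real addresses aren't given in this thread), and the helper 
names page_text and fetch_precincts are hypothetical.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML page, skipping whitespace-only ones."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def page_text(html):
    """Return the non-empty text nodes of an HTML string as a list."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def fetch_precincts(url_template, count=50):
    """Fetch pages 1..count and map precinct number -> list of text chunks.

    url_template is a hypothetical pattern such as
    "http://example.com/precinct{n}.html"; substitute the real one.
    """
    results = {}
    for n in range(1, count + 1):
        with urlopen(url_template.format(n=n)) as response:
            html = response.read().decode("utf-8", errors="replace")
        results[n] = page_text(html)
    return results
```

The point is less the specific code than how little of it is needed 
when the data is published as plain HTML, one page per precinct.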

Too often we see Cargo Cult thinking in data management, where folks 
start using a tool or a format only because they hear about it from 
others, but since they don't actually use the system they're delivering 
they never come to understand what's useful and what's an impediment.


 > (The other two text options in Adobe are "Rich Text Format" and "Text
 > (Plain)", neither of which works--only "Text (Accessible)"

What is "Text (Accessible)"?


 > I was unaware of Apple's Automator. I'll look into it--but it is
 > unnecessary for this project.

Warning:  Automator is a lot of fun, and may be addictive.  Be careful 
playing with it, since you may find yourself experimenting with all 
sorts of things and before you know it your Saturday is completely gone. :)

-- 
  Richard Gaskin
  Fourth World Systems
  Software Design and Development for the Desktop, Mobile, and the Web
  ____________________________________________________________________
  Ambassador at FourthWorld.com                http://www.FourthWorld.com





More information about the use-livecode mailing list