"ouch: the beginning of the end"
David V Glasgow
dvglasgow at gmail.com
Thu Mar 9 03:46:09 EST 2017
Mark,
This post is hereby awarded the ‘exposition clarity badge’. You should sew it onto your sleeve (with all the others), and wear with pride.
I never felt I understood the problem, let alone possible solutions, and now I think I do both.
A really interesting read, thank you.
Cheers,
David Glasgow
> On 8 Mar 2017, at 11:59 am, Mark Waddingham via use-livecode <use-livecode at lists.runrev.com> wrote:
>
> Hi Dr Hawkins,
>
> I've been away on holiday for just over a week, and this thread has got
> quite long, so I thought it easier to answer the original post rather
> than some offshoot of it.
>
> On 2017-03-03 00:13, Dr. Hawkins via use-livecode wrote:
>> I just got off the phone with the court clerk in Reno, and received the
>> beginning of the end . . .I figured it would come from some state or another
>> in a year or two, but they are requiring me to use the *exact* pdf as
>> propagated by the court.
>
> Having read the entire thread, my understanding of your problem is as follows
> (please correct if I am wrong):
>
> ----
>
> You have PDF forms which are downloadable from a government department. They
> are intended for printing and then filling in - i.e. they do not use
> editable PDF forms (FPDF?).
>
> The government department for whatever reason requires that the forms are used
> exactly as is with the user filling in the relevant spaces within them and then
> submitting.
>
> There is some claim by said department that 'at some point' they will get
> scanners which will be able to tell whether the original forms were used or not
> thus you are not allowed to recreate the non-user parts of the form.
>
> ----
>
> Reading between the lines the latter requirements of the department are not
> unreasonable - I suspect they would like to automate their processes as much
> as possible and as such would like to be able to have a computer via OCR or
> whatever suck out the appropriate parts of forms at some point to remove a
> human from the equation.
>
> Given that there is an obvious 'printing' element involved in this at present
> pixel-perfection is not exactly what they are looking for (unless they are
> imagining they live in a world where all printers are capable of absolutely
> perfect registration - some skew / offset is always going to be present) just
> that whatever software they might use in the future to automate can locate
> the user written parts to suck out - therefore it is reasonable for them to
> require that the non-user sections are laid out relative to each other and look precisely
> the same as if you printed the original PDF.
>
> I'll run on these above assumptions for now.
>
> ----
>
> First of all let me just point out that EPS is definitely *not* what you want.
>
> EPS is just a PostScript program with appropriate comments describing an
> (optional) pre-rendered thumbnail, and other print related metadata so it
> can be embedded in another document. Rendering EPS properly requires a full
> PostScript interpreter - many programs which 'support EPS' actually only support
> rendering the thumbnail and then only printing on a PostScript printer.
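To illustrate the split described above between the descriptive comments and the program itself, here is a minimal Python sketch. It assumes a hypothetical sample EPS header (the `SAMPLE_EPS` text and both function names are made up for illustration). It reads the `%%` structuring comments, such as the bounding box, without executing anything. The drawing commands that follow them are ordinary PostScript and would still need a real interpreter.

```python
import re

# Hypothetical sample: the start of a small EPS file. The %%-prefixed lines are
# structuring comments; a PostScript interpreter treats them as plain comments.
SAMPLE_EPS = """%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 0 0 612 792
%%Title: (court form)
%%EndComments
newpath 72 72 moveto 540 72 lineto stroke
showpage
"""

def read_dsc_header(eps_text):
    """Collect header comments up to %%EndComments, without running any PostScript."""
    header = {}
    for line in eps_text.splitlines():
        if line.startswith("%%EndComments"):
            break
        match = re.match(r"%%(\w+):\s*(.*)", line)
        if match:
            header[match.group(1)] = match.group(2).strip()
    return header

def bounding_box(header):
    """Return the bounding box as four integers (llx, lly, urx, ury), or None."""
    raw = header.get("BoundingBox")
    if raw is None or raw == "(atend)":
        return None
    return tuple(int(value) for value in raw.split())

header = read_dsc_header(SAMPLE_EPS)
print(header.get("Title"), bounding_box(header))
```

Anything beyond this metadata (here, the `newpath ... stroke` line) is program code, which is why embedding an EPS is much easier than rendering one.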
>
> Indeed, there is a good reason why no non-GPL full open-source PostScript
> interpreter exists (as far as I'm aware at least) - they are complex pieces
> of software which have a high degree of commercial value.
>
> Whilst Linux and Mac users might be used to transparent PostScript support this
> is only because GhostScript is installed as an innate part of the printing tool
> chain on those platforms - thus this is an innate part of the 'system' and as
> such you can write non-GPL applications which use it as you don't need to distribute
> it with your app. On all other platforms, however, you are looking at having to
> distribute a PS interpreter with your app - and at that point you are hit by the
> GPL (in particular, in your case, it would classify as an 'innate' requirement
> of your application and non-optional and thus virality would kick in).
>
> So, if you want a PostScript interpreter in your app you are going to have to
> pay $$$$$ to license such a thing. (Including such a thing in LiveCode would
> require license fees or development costs way above what most people would want
> to pay for a feature they would probably rarely if ever use and as such it is
> unreasonable to expect LiveCode to support such things cross-platform as part of
> the standard license fee - even at the Business license level).
>
> One of the main reasons that Adobe created PDF was to avoid needing a PostScript
> interpreter to accurately create 'archival' type quality representations of printable
> documents and to provide a much easier way to edit / amend and modify such documents.
> As PDF is just a data structure the latter can be done by processing a generated
> PDF. As EPS/PS are actually a program all bets are off for editing - the program
> does what it is written to, and you can write it in any way you want. If you want to
> 'edit' it, you need to edit the program.
>
> However....
>
> PDF is also a large complicated format whose reading, writing and rasterisation
> has huge commercial value.
>
> Up until Google bought and open-sourced *part* of FoxIT so they could include a
> full and complete cross-platform PDF renderer in Chrome (in the form of PDFium)
> there was no non-GPL open-source full and complete PDF renderer available in
> the open-source world that I know of.
>
> As far as I'm aware all such open-source libraries for PDF rasterisation and
> manipulation which existed up until that point were GPL and all of them offer
> commercial licensing terms. The costs of which are substantial - again, well
> outside the cost of what you could reasonably expect to get 'built in' to the
> LiveCode license at any level.
>
> Of course, when you look into what Google did you find out that whilst PDFium
> is FoxIT - it is only a *subset* of FoxIT. Google only licensed the rasterisation
> part - PDFium does not contain any of the public APIs which allow editing, merging,
> modification and re-export of PDFs.
>
> Again, you can understand why - the latter part of PDF manipulation has perhaps
> the greatest part of the commercial value and since Google only wanted rasterisation
> that was all they were going to pay for.
>
> ----
>
> So, just to reiterate, the expectation that LiveCode should contain a full PS/EPS/PDF
> rendering, manipulation and 'do whatever I want' type thing in it on all platforms is
> somewhat beyond the current price of the license fee. Or should I say, far beyond what
> any one person/organisation who does not need such functionality (which is most people)
> would be willing to pay.
>
> (I should point out here that I know what is involved in writing both a PostScript
> interpreter, and PDF renderer as I have written a partial implementation of both in the
> dim and distant past - for RiscOS in the early 1990's... Back when PS was still mostly
> Level 2, and the PDF spec weighed in at around 150 pages... PostScript is now universally
> at Level 3, and the PDF spec weighs in at 700+ pages - thus I do not begrudge
> the commercialization of such libraries at all as they are large hefty pieces of work which
> have to deal with inputs which may or may not completely conform to specification).
>
> Anyway, bemoaning the costs of developing and supporting such things aside, back
> to your actual problem...
>
> First of all on some platforms what you want to do is actually not all that hard at all.
>
> Mac and iOS both include full built-in PDF rendering and emission support. CoreGraphics
> can both load and render PDF directly *and* also render and save PDF directly which means
> that it is relatively straightforward (with a bit of LiveCode Builder or C++) to do what
> you want - i.e. render an original page of a PDF then render some text on top. However,
> it is important to point out that this approach will not result in the PDF necessarily
> being original PDF + extra bits since you are re-rendering the PDF (although I don't
> think this is a problem in your case as it sounds like there is an implicit 'may go
> through an actual scanner' step in the government department's process).
>
> Similarly, Linux always includes a postscript interpreter in its default install if you
> install printing support. PDF can be rendered in PostScript by using an appropriate
> header PostScript program (which converts the PDF data structure into a PostScript
> program - in fact the main rendering bits in PDF are actually PostScript programs
> just with a very fixed set of well defined operators which you can define in a PS
> environment). Thus on this platform you could emit the necessary header, the PDF
> and then the additions you require as PostScript programs.
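The PostScript-like nature of PDF page descriptions can be seen in a tiny example. The Python sketch below assumes a hypothetical content-stream fragment (`CONTENT`) and a deliberately simplified tokenizer that handles only numbers, names and simple strings. It groups the stream into operator/operand pairs, showing that operands come first and the operator last, as in PostScript. The operators used (`m`, `l`, `S`, `BT`, `Tf`, `Td`, `Tj`, `ET`) are standard PDF ones.

```python
import re

# Hypothetical fragment of a PDF page content stream: stroke one line, then draw text.
CONTENT = b"72 700 m 540 700 l S BT /F1 12 Tf 72 680 Td (Name) Tj ET"

# Simplified tokens: a literal string in parentheses (no nesting), a /Name,
# a number, or a bare operator keyword. Real streams have more token types.
TOKEN = re.compile(rb"\((?:[^()\\]|\\.)*\)|/[^\s/()<>\[\]]+|[-+]?\d*\.?\d+|[A-Za-z*]+")

def operations(stream):
    """Group a content stream into (operator, operands) pairs (postfix order)."""
    ops, operands = [], []
    for tok in TOKEN.findall(stream):
        first = tok[:1]
        if first in (b"(", b"/", b"-", b"+", b".") or first.isdigit():
            operands.append(tok.decode("latin-1"))   # operand: wait for its operator
        else:
            ops.append((tok.decode("latin-1"), operands))
            operands = []
    return ops

for operator, operands in operations(CONTENT):
    print(operator, operands)
```

Because the operator set is fixed, each operator can in principle be defined as a procedure in a PostScript environment, which is the approach described above.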
>
> Where you run into difficulty is on Windows and Android. Neither of these platforms
> include either publicly accessible PDF or PS support (although it appears Windows
> 10 might have a built in PDF Printer at least...).
>
> ----
>
> So what options are there?
>
> - Option 1 - bi-level background images
>
> Here I'm assuming that your original PDFs do not change that often and (given the
> requirements you have found out from the government department involved) the forms
> must be used as is. Thus, I presume any 'recurring sections' would need to be
> rendered on repeated images of the appropriate page rather than cutting up the
> original forms into pieces and just replicating those parts.
>
> In this case, then pre-rendering all the pages as high-resolution black-and-white
> 1bpp bitmaps and then rendering those underneath the LiveCode fields is probably not
> that bad an option. Given that the average printer people will be using will probably
> only have a true black-and-white resolution of 300-600dpi and most printed forms are
> only about 5% black pixels you will get immensely high compression ratios. The only
> slight snafu here right now is that PDF printing support in LC does not yet exist
> for Android, and would need a small patch to pass PNG data straight through to the
> PDF (at present it only does this for JPEG). [ The reason PDF printing is not currently
> supported on Android is due to text rendering which is not a straightforward thing in
> PDF nor PostScript; the reason only JPEG image data is currently supported is that
> when the pass-through was implemented the library we use to do PDF printing - cairo -
> only supported it for JPEG, I *think* it does support certain PNG formats now though
> since we updated the library for other reasons a while back ].
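A rough feel for the compression claim can be had with a short Python sketch. It assumes a synthetic page rather than a real form: a US Letter page at 300dpi, all white except for one solid black ruled line every 20 rows (about 5% black). It uses zlib's deflate, the same compression scheme PNG uses. The function name and parameters are made up for illustration, and a real form with printed text will compress less well than this idealised one.

```python
import zlib

def blank_form_bitmap(width_in=8.5, height_in=11.0, dpi=300, rule_every=20):
    """Build a synthetic 1bpp page: white, with one black row every `rule_every` rows.

    One black row in 20 gives roughly 5% black pixels. A set bit means black here,
    which is an arbitrary choice for this sketch.
    """
    width_px = int(width_in * dpi)
    height_px = int(height_in * dpi)
    row_bytes = (width_px + 7) // 8      # 8 pixels per byte, rows padded to whole bytes
    white_row = bytes(row_bytes)
    black_row = b"\xff" * row_bytes
    rows = [black_row if y % rule_every == 0 else white_row for y in range(height_px)]
    return width_px, height_px, b"".join(rows)

width_px, height_px, raw = blank_form_bitmap()
packed = zlib.compress(raw, 9)
print(f"{width_px}x{height_px} px, raw {len(raw)} bytes, deflated {len(packed)} bytes")
print(f"compression ratio about {len(raw) / len(packed):.0f}:1")
```

Uncompressed, the 300dpi page is about 1MB at 1bpp, and a 600dpi page would be four times that, so the compression is what makes shipping a full set of pre-rendered forms practical.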
>
> - Option 2 - augment the original PDF
>
> PDF documents can be augmented after creation - the data structure is designed to
> allow revisions which overlay the original document. Thus it should be possible to
> generate modifications to the original PDF and append them to it.
>
> The difficulty here is that it would require some intimate knowledge of the PDF
> document structure (although far less than what would be required to generate one
> from scratch). Basically, you provide modified page objects for each page and a
> modified 'page tree' which first contains all the original things on the page
> and then adds text objects (which is not too bad to generate if you just want ASCII
> characters in one of the built in fonts such as Helvetica) in the places you need.
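The text-object part of this is the simplest piece to sketch. The Python below is a minimal illustration under stated assumptions, not a complete implementation: the function names, the `F1` font resource name and the object number are hypothetical, and only ASCII text is handled. It generates the content-stream bytes for one piece of text and wraps them as a stream object. The rest of the process is not shown: a modified page object that lists both the original contents and the new stream, a font resource entry, and a new cross-reference section and trailer pointing back at the previous one.

```python
def pdf_escape(text):
    """Escape the three characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

def text_object(x, y, text, font_resource="F1", size=10):
    """Return content-stream bytes that draw `text` with its baseline starting at (x, y).

    Coordinates are in PDF points (1/72 inch) from the bottom-left of the page,
    assuming the page's default coordinate system is in effect. `font_resource`
    is a hypothetical name that the modified page's font resources would have
    to map to a built-in font such as Helvetica.
    """
    if not text.isascii():
        raise ValueError("this sketch only handles ASCII text")
    ops = f"BT /{font_resource} {size} Tf {x} {y} Td ({pdf_escape(text)}) Tj ET\n"
    return ops.encode("ascii")

def stream_object(obj_num, data):
    """Wrap content-stream bytes as an indirect stream object with generation 0."""
    head = f"{obj_num} 0 obj\n<< /Length {len(data)} >>\nstream\n".encode("ascii")
    return head + data + b"\nendstream\nendobj\n"

# Hypothetical usage: one filled-in field, stored as new object number 12.
field = text_object(72, 700, "Reno (NV)")
print(stream_object(12, field).decode("ascii"))
```

One practical caveat: if the original page content leaves a transformed coordinate system in effect, the added text will land in the wrong place, so the original contents may need to be bracketed with the `q` and `Q` (save and restore graphics state) operators.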
>
> Such a process could be implemented in LiveCode Script and would be completely
> independent of platform. Also, it would preserve the original PDF entirely (no
> round-tripping through a PDF rasterizer) as you would only be adding to what
> was already there.
>
> How much work would be involved in writing said script, however, is another matter.
>
> - Option 3 - wait until LiveCode can render PDFs directly as an object on a card
>
> This is obviously what you had hoped you could do and whilst not entirely
> unreasonable, I hope you can appreciate from the above why you currently cannot -
> particularly on all platforms.
>
> PDFium does at least give us a starting point - however it isn't the easiest of libraries
> to build or to keep building, and there's still a fair bit of work we need to do to
> allow it to function cross-platform (not least the building of it for all platforms!).
>
> Also, lamentably, that is only one side of the story - you also need to generate PDFs,
> which means some library to output PDF is needed which is happy to bind to PDFium's
> rasterisation implementation. This is certainly not something which is exposed in the
> public APIs of PDFium, and would probably require bespoke customisation of PDFium to
> achieve.
>
> - Option 4 - focus on Mac/iOS and do other platforms later
>
> As mentioned above, both Mac and iOS include PDF rendering and emission as part of
> CoreGraphics - they also include relatively straightforward APIs for drawing typeset
> text. The process here would be:
>
> 1) Create a CG PDF output context
> 2) Load your original PDF as a CG PDF object
> 3) For each page:
>    i) Render the original page into the PDF output context
>    ii) Render the text into the appropriate places on the page
> 4) Finalize the output context to generate a PDF
>
> I recently did some work for a business services request which needed to render
> portions of a PDF to a new PDF on Mac - and it turned out to be around 50 lines of
> C to do that. Rendering the text you would need through CoreText would be a little
> more than that, but nothing too onerous.
>
> ----
>
> So anyway, sorry to be the bearer of perhaps not entirely great news, however what
> you want to do is certainly possible - but like most things will require some leg-work
> and a little bit of patience and/or some financial investment.
>
> I do strongly suggest you contact business services (https://livecode.com/services/)
> about what you need here. It is important to understand that whilst we would like to
> do everything, we do need a way to prioritise what we focus on. Whilst PDF rendering
> and output features are (obviously) quite widely useful for lots of things they are also
> substantial and large features to develop and maintain (if they weren't we would be
> surrounded by lots of open-source non-GPL implementations to choose from and base them
> on) thus progress on them generally in terms of additions to the core product are likely
> to be slow. However you do have a very specific use-case with well defined inputs and
> outputs so we may be able to help you for far less than it would cost you to commercially
> license the relevant cross-platform libraries you need and/or a platform which provides
> the functionality out of the box. (My gut tells me that starting with Mac/iOS due to
> their built in API support for what you want to do is probably the best first step to take;
> at least then you get a product which works as it needs to - and like any venture, the
> sooner you ship, the sooner you can generate revenue to reinvest and expand!).
>
> Warmest Regards,
>
> Mark.
>
> --
> Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
> LiveCode: Everyone can create apps
>