text copied from LC generated PDF, WTF?
Klaus major-k
klaus at major-k.de
Thu Feb 20 09:31:30 EST 2020
Hi Mark,
> On 20.02.2020 at 14:55, Mark Waddingham via use-livecode <use-livecode at lists.runrev.com> wrote:
>
> On 2020-02-18 18:40, Klaus major-k via use-livecode wrote:
>> Hi friends,
>> I know that copying text from a PDF file may give unexpected results,
>> but this is really ridiculous!?
>> I created a PDF from LC (selected "Save as PDF" in the macOS Print dialog)
>> and when I copy some text and paste it into TextEdit, this is what I get:
>> <https://major-k.de/staxx/text_from_lc_pdf.jpg>
>> Where on earth are my numbers and where is my text?
>> Any insights much appreciated!
>
> As requested by Klaus on the forum thread (http://forums.livecode.com/viewtopic.php?f=9&t=33683&start=15) this isn't a bug.
> TL;DR version - extracting text from PDFs is hard, and viewers all do it differently with different levels of 'correctness'.
> The fonts used and the layout can affect what they can produce.
> In this case, the stack in question was being printed with the default system theme fonts (on macOS this is .SFNSText it would seem) - and for whatever reason that font generates glyphs for numbers in the PDF which PDF viewers don't seem to be able to map back to actual digits.
> Upshot - make sure the controls you are printing have an explicit font setting to a 'normal' font if you want to be able to copy text from any PDF you might generate as a result :)
thank you very much for this valuable hint! :-D
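For the archive, a minimal sketch of what that hint could look like in script, assuming the controls on the stack do not already have their own textFont set (so they inherit it from the stack). The handler name and the choice of "Helvetica" are only examples, not anything Mark prescribed:

```
-- Hypothetical handler name; call it before printing or saving as PDF.
command useExplicitPrintFont
   -- "Helvetica" is an assumed example of a 'normal' font.
   -- Controls with an empty textFont inherit this value from the stack
   -- instead of falling back to the system theme font.
   set the textFont of this stack to "Helvetica"
end useExplicitPrintFont
```

Controls that do have an explicit textFont of their own would keep it and would need to be changed individually.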
> Warmest Regards,
>
> Mark.
Best
Klaus
--
Klaus Major
https://www.major-k.de
klaus at major-k.de