HTML to text in field

David V Glasgow dvglasgow at gmail.com
Thu Aug 9 08:00:40 EDT 2018


Hello folks,

I am having an interesting time (macOS 10.13.5, LC 8.1.9) trying to load some HTML files (≤ 5 MB or so). Most of them will be lists or tables, generated by various users on various systems.

I don’t want to retain any of the formatting except line endings, so I would be happy for tables to appear as lists. I found a little 2013 nugget from the estimable Jacqueline Landman Gay:

set the htmltext of the templatefield to htmlVar -- variable contains the html string
put the text of the templatefield into tPlainText

In some cases that works fine, but in others it seems that HTML tables of maybe 20-30 thousand rows are rendered onto a single line of the field. A sort of black-letters-overwritten splodge appears in the first row, LC cranks up to 100% of the processor, and BBoD ensues.

Sometimes it never seems to recover, but other times it hands back control after maybe 20 minutes or so, and in those cases I can see the text if I set dontwrap to false.  It contains no line endings from the original table, and a shedload of tabs.

I have tried to operate on the HTML string in a variable before putting it into the field, but frankly I don’t really know what property of some HTML tables might cause the line endings to be lost. The only row delimiter I can see when I examine the files in an editor is </tr>.

I tried a different approach, replacing a row end with a cr, and then stripping out tags:

put URL ("file:" & theFilePath) into ttemp

replace "</tr>" with cr in ttemp

replaceText (ttemp, "<*>", "|")

filter lines of ttemp without empty

set the text of field "import" to ttemp


The replaceText line generates an error “button "Import HTML": execution error at line 7 (Handler: can't find handler) near "replaceText", char 1”  

Firstly, I don’t understand the error, and secondly, I am worried I may be overcomplicating something which should be simple.
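
Looking at the dictionary again, replaceText is listed as a function rather than a command, so perhaps its result has to be put somewhere. Below is a minimal, untested sketch of what I was aiming for, assuming that is the cause of the error and that a regular expression such as "<[^>]*>" is the right way to match a tag. The handler name htmlRowsToText is made up purely for illustration.

```livecode
function htmlRowsToText pHtml
   -- turn each table row end into a line ending
   replace "</tr>" with cr in pHtml
   -- replaceText is a function, so its return value is stored back;
   -- "<[^>]*>" is assumed to match any single tag
   put replaceText(pHtml, "<[^>]*>", "|") into pHtml
   -- drop lines that are completely empty
   filter lines of pHtml without empty
   return pHtml
end htmlRowsToText
```

It would be called along the lines of: set the text of field "import" to htmlRowsToText(URL ("file:" & theFilePath)). I have no idea yet whether this copes any better with the very large tables.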

Advice please!

Best wishes,

David Glasgow

