HTML to text in field

Richmond Mathewson richmondmathewson at gmail.com
Thu Aug 9 09:08:37 EDT 2018


Well, although this might sound a bit goofy, how about just bunging your
HTML text into a scrolling list field and moving down to a new line every
time you come across a </tr>?

This works, as long as each </tr> in the file has white space on either side of it:

on mouseUp
    -- start with an empty field
    set the text of fld "SLF1" to empty
    answer file "Choose an HTML file to import"
    if the result is "cancel" then exit mouseUp
    -- read the raw bytes and decode them as UTF-8
    put textDecode(URL ("binfile:" & it), "UTF-8") into CHEESE
    put 1 into LYNE
    -- walk through the text one word at a time
    repeat for each word WURD in CHEESE
        if WURD is "</tr>" then
            -- end of a table row: move down to the next line
            add 1 to LYNE
        else
            put WURD after line LYNE of fld "SLF1"
        end if
    end repeat
end mouseUp

where fld "SLF1" is a scrolling list field.

Admittedly it makes a pig's breakfast out of everything else.
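
If going word by word proves slow on a 5 MB file, working on the whole
string at once might be worth a try. What follows is only a sketch and has
not been run against your files: the function name htmlRowsToLines is made
up, and it assumes the line breaks already in the file carry no meaning,
so it turns them into spaces first. Note that replaceText is a function
rather than a command, so its result has to be put somewhere.

```livecode
function htmlRowsToLines pHTML
   -- assumption: line breaks already in the file are not significant
   replace crlf with space in pHTML
   replace cr with space in pHTML
   -- each row end becomes a line break
   replace "</tr>" with cr in pHTML
   -- strip whatever tags remain; replaceText returns the changed string
   put replaceText(pHTML, "<[^>]*>", empty) into pHTML
   -- drop the lines that are left empty
   filter lines of pHTML without empty
   return pHTML
end htmlRowsToLines
```

It could then be called from the handler above with something like
set the text of fld "SLF1" to htmlRowsToLines(CHEESE). The cells of each
row end up run together with nothing between them, so this is no tidier
than the loop, just possibly quicker.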

Richmond.



On 9/8/2018 3:00 pm, David V Glasgow via use-livecode wrote:
> Hello folks,
>
> I am having an interesting time (MacOS 10.13.5 LC 8.1.9) trying to load some HTML files (≤ 5 ish MB).  Most of them will be lists or tables, generated by various users on various systems.
>
> I don’t want to retain any of the formatting, except line endings, so I would be happy for tables to appear as lists.  I found a little 2013 nugget from the estimable  Jacqueline Landman Gay
>
> set the htmltext of the templatefield to htmlVar -- variable contains the html string
> put the text of the templatefield into tPlainText
>
> In some cases that works fine, but in others, it seems that HTML tables consisting  of maybe 20-30 thousand rows are rendered onto a single line of the field.  A sort of black-letters-overwritten splodge appears in the first row and LC cranks up to 100% of the processor and BBoD ensues.
>
> Sometimes it never seems to recover, but other times it hands back control after maybe 20 minutes or so, and in those cases I can see the text if I set dontwrap to false.  It contains no line endings from the original table, and a shedload of tabs.
>
> I have tried to operate on the HTML string in a variable before putting it into the field, but frankly don’t really know what property of some HTML tables might mean that line endings are lost.  I can only see </tr> when I examine the files in an editor.
>
> I tried a different approach, replacing a row end with a cr, and then stripping out tags:
>
> put URL ("file:" & theFilePath) into ttemp
>
> replace "</tr>" with cr in ttemp
>
> replaceText (ttemp, "<*>", "|")
>
> filter lines of ttemp without empty
>
> set the text of field "import" to ttemp
>
>
> The replaceText line generates an error “button "Import HTML": execution error at line 7 (Handler: can't find handler) near "replaceText", char 1”
>
> Firstly I don’t get the error, and secondly I am worried I may be over complicating something which should be simple.
>
> Advice please!
>
> Best wishes,
>
> David Glasgow
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode



