HTMLtext doesn't play well with CSS
Richard Gaskin
ambassador at fourthworld.com
Sat Jul 17 12:50:17 EDT 2010
Tim Ponn wrote:
> I want the user to be able to change font sizes, make bold,
> italic, whatever. I also want them to have the freedom to
> turn some of the text into links, etc. When I try to use
> HTMLtext in rev, the results are not so good. How do I
> improve it?
As Jim pointed out, the htmlText of a field is not true HTML in the
browser sense; it could more accurately be called "xmlText" because it
uses XML tags to represent style runs, but is not designed to be
web-ready HTML.
The htmlText property was added to the engine to provide something no
other xTalk had, which is very, very useful: a plain-text description
of everything in a field, both content and style attributes. Unlike
rtfText, htmlText is designed to be the one way a field's content and
styles can be reproduced in another field with complete fidelity. As
such it includes tags like threeDBox, which the Rev engine supports but
HTML does not, and it lacks a good many HTML things like CSS.
One useful thing about htmlText is that the order of tags is fairly
consistent when you obtain that property from a field, regardless of the
tag order you may have used to set those attributes.
For example, you can use this:
set the htmlText of fld 1 to "<i><b>Hello</b></i>"
...and when you get the htmlText you'll get:
<p><b><i>Hello</i></b></p>
Note the reversal of the order of <i> and <b>. This happens because the
engine stores a field's styles in a binary representation in which those
flags have fixed positions. Setting the htmlText parses the string into
those binary flags; getting it translates the flags back into text tags
in whatever order the engine stores them, regardless of the order you
used when setting them.
This can be useful because it can allow you to predict what certain
combinations of attributes will be, and then do a search-and-replace to
swap 'em out for CSS assignments.
For example, if you had a CSS class named "MyClass" which sets the bold
and italic of text, you could write this to translate the htmlText to
use CSS assignments:
replace "<b><i>" with "<span class='MyClass'>" in tData
replace "</i></b>" with "</span>" in tData
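If you have more than one such pairing, the same idea can be driven from a small table. This is only a minimal sketch: the handler name applyClassMap and the tab-delimited rule format (opening tags, closing tags, class name) are my own inventions for illustration, not anything built into the engine. Rules for longer tag sequences should come first so shorter rules don't consume part of them.

```livecode
-- Hypothetical helper: applyClassMap and its rule format are assumptions
-- made for this sketch, not part of the engine.
-- pMap holds one rule per line: openTags <tab> closeTags <tab> className
function applyClassMap pHtml, pMap
   local tRule
   set the itemDelimiter to tab
   repeat for each line tRule in pMap
      -- skip blank or incomplete rules
      if item 1 of tRule is empty or item 2 of tRule is empty then next repeat
      replace item 1 of tRule with "<span class='" & item 3 of tRule & "'>" in pHtml
      replace item 2 of tRule with "</span>" in pHtml
   end repeat
   return pHtml
end applyClassMap
```

It could be called like this:

```livecode
put "<b><i>" & tab & "</i></b>" & tab & "MyClass" into tMap
put applyClassMap(the htmlText of fld 1, tMap) into tData
```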
At first this seems excitingly easy to deal with, and it can be extended
to include font size, font face, and other attributes.
But then we come to nested tags, and meet with a grave disappointment. :(
This htmlText:
<p><u><b>Hello</b></u><b> <i>world</i></b>.</p>
...describes the style runs for the words "Hello world" in which both
words are in bold but "Hello" is underlined and "world" is italicized.
Note what happened to "<b>" there: it got replicated to enclose each
word separately, as opposed to this form which would be more common in HTML:
<p><b><u>Hello</u> <i>world</i></b>.</p>
In some cases this won't be a big problem, since while it adds a bit of
bloat to the page it can still allow simple wholesale replacements to be
used to assign classes.
But there may be times when it's not sufficient, requiring you to parse
the tags by examining them in sequence and omitting the redundant ones
(see the optional third argument to the offset function for a good way
to make a pull-parser).
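Here is a minimal sketch of that pull-parser idea. The third argument to offset is the number of characters to skip, and the value returned is relative to that skip point, so each call can resume where the last one ended. The handler name nextTag and its by-reference position parameter are assumptions of mine for illustration; deciding which tags are redundant is left to the caller.

```livecode
-- Hypothetical helper: returns the next tag in pText and advances pPos.
-- pPos is the number of chars already consumed; start it at 0.
function nextTag pText, @pPos
   local tStart, tLen, tTag
   -- offset() skips pPos chars and returns a position relative to that point
   put offset("<", pText, pPos) into tStart
   if tStart = 0 then return empty
   -- look for the closing ">" after the "<" we just found
   put offset(">", pText, pPos + tStart) into tLen
   if tLen = 0 then return empty
   put char (pPos + tStart) to (pPos + tStart + tLen) of pText into tTag
   put pPos + tStart + tLen into pPos
   return tTag
end nextTag
```

A loop like this would walk every tag in order, collecting one per line in tTags:

```livecode
put 0 into tPos
repeat forever
   put nextTag(tData, tPos) into tTag
   if tTag is empty then exit repeat
   put tTag & return after tTags
end repeat
```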
With WebMerge, the revJournal blog, and some custom CMS solutions for
clients, I've had to deal with these sorts of issues myself. In those
contexts the efficiency of the page generation was a higher priority
than the cleanliness of the resulting HTML, so I opted for what could
arguably be called laziness in how those tags are dealt with. ;)
It would be ideal if we had a nicely generalized function like this:
webHtml(pHtmlText, pCss)
...in which pHtmlText is the raw htmlText of a field and pCss is a set
of CSS definitions. The function could then parse the text, look for
tag patterns which can be satisfied by the various CSS definitions
supplied, and replace those htmlText tag sequences with appropriate
class and style assignments as needed.
Unfortunately I have no such function in my libraries. It's been on my
to-do list, but has been a much lower priority than other things which
actually get done. :)
Given the complexity of the task, this might make a good exercise for
the readers here. As often happens on this list, folks would likely
submit different forms, each more complete and hopefully faster than the
last, and if the process follows historic norms, at the end Alex Tweedly
will come up with a three-line solution using arrays. :)
If the function were released into the public domain or under the MIT
license, it could be used in commercial projects as well as open source
projects without legal encumbrance.
Anyone up for such a task?
I'll offer a code bounty of $100 for any reasonably efficient function
that does that.
--
Richard Gaskin
Fourth World
Rev training and consulting: http://www.fourthworld.com
Webzine for Rev developers: http://www.revjournal.com
revJournal blog: http://revjournal.com/blog.irv