regex/HTMLText question

Kay C Lan lan.kc.macmail at gmail.com
Wed Dec 16 00:48:34 EST 2009


On Wed, Dec 16, 2009 at 5:44 AM, Jim Ault <jimaultwins at yahoo.com> wrote:

> Caution: wordoffset, replace, regEx
> You need to decide what constitutes a word.
> In Rev,
> ending ending. ending, ending?  ending!   ending)   ending]   ending"
> ending's   ending=  (ending)
> are all words, so the last word in a phrase or sentence cannot be matched
> by wordoffset without a bit of rule checking for punctuation.
>
> Chris,

One way around this is to use token rather than word; look it up in the Rev
Dictionary.

You need to read it a couple of times, and unfortunately it's slightly
erroneous in that some characters, like %, are also tokens but are not listed
in the first group. But if you experiment a bit you'll find that, of all the
examples above, plus ending; and ending: the only ones to cause you
problems are:

ending. (period)
ending?
and the single instance of a double quote (everything after it....
disappears....)

Also, as alluded to in the Dictionary, anything between a pair of double
quotes is a single token, so "this is the ending, almost" will appear to Rev
as a single token.

So it's certainly not without its own pitfalls, but with token there are far
fewer punctuation characters to deal with than with word.
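
If you do stay with word, the "rule checking for punctuation" Jim mentions
can be fairly small. Below is a minimal sketch of that idea, written in Python
rather than Rev script purely to show the logic. The function name
word_offset and the choice to strip every ASCII punctuation character from
both ends of each word are my own assumptions for illustration, not anything
taken from Rev.

```python
import string


def word_offset(target, text):
    """Return the 1-based position of target among the whitespace-delimited
    words of text, ignoring punctuation attached to either end of a word.
    Returns 0 when nothing matches."""
    for position, word in enumerate(text.split(), start=1):
        # Strip leading/trailing punctuation such as . , ? ! ) ] " ( =
        if word.strip(string.punctuation) == target:
            return position
    return 0


sample = 'Is this the (ending)? No, "this" is the ending.'
print(word_offset("ending", sample))  # matches "(ending)?", the 4th word
print(word_offset("this", sample))    # matches the plain "this", the 2nd word
```

Note that this only trims the ends of each word, so ending's would still
not match ending; whether it should is exactly the "what constitutes a
word" decision you have to make.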

NOTE the Note in the Rev Dictionary entry for token. If this is for a
commercial app then maybe token isn't something you should be working with,
as it's clearly designed by the Rev team purely to work with the Rev
language and, I suppose, is subject to change. On the other hand, if you're
writing something one-off, I'd go for it.

HTH


