word counts - what is going on?

Peter M. Brigham pmbrig at gmail.com
Tue Aug 14 13:17:19 EDT 2012


On Aug 14, 2012, at 12:07 AM, James Hale wrote:

> …. After a bit of to'ing and fro'ing I think I see the problem.
> Word boundaries.
> For example, quoted text is considered 1 word.
> Removing the quotes is ok to find the included words within the quoted text but from then on the word number is out of whack.
> By this I mean that if I see that word 45 of line 1223 is "World" for example, I can't simply hilite word 45 of line 1223 of mytext field and expect the hilite to fall on "World".
> 
> It would seem that to avoid these contradictory requirements (need to keep quoted text versus identifying words within the quote) I might need to revisit character positions.
> 
> So, back to the drawing board.

Would it work for you to replace straight quotes with curly quotes (character codes 147 and 148 in the Windows-1252 set; on a Mac, MacRoman uses 210 and 211) before doing anything? I have long been bothered by the HC-legacy rule that anything within quotation marks counts as one word. I solved it with a keydown handler in my main user input field that made sure no straight quote chars ever ended up in the field. That way I could use text parsing routines that were more rational. Use something like this to clean up your text:

-- a quote at the very start of the text is an opening quote
if quote = char 1 of tText then put numtochar(147) into char 1 of tText
-- a quote after whitespace or an opening bracket is an opening quote
replace space & quote with space & numtochar(147) in tText
replace cr & quote with cr & numtochar(147) in tText
replace "(" & quote with "(" & numtochar(147) in tText
replace "[" & quote with "[" & numtochar(147) in tText
replace tab & quote with tab & numtochar(147) in tText
-- whatever straight quotes remain are closing quotes
replace quote with numtochar(148) in tText
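
For illustration, a keydown handler of the kind mentioned above might look roughly like the sketch below. It is an assumption about how such a handler could be written, not the original script: the variable names tPrev and tCurly are made up here, the handler is assumed to live in the script of the input field itself, and the codes 147 and 148 are simply carried over from the cleanup code (swap in the right codes for your platform).

```livecode
-- Hypothetical field script: swap each typed straight quote for a curly one.
on keyDown pKey
   local tPrev, tCurly
   if pKey is not quote then
      pass keyDown -- every other key goes through untouched
   end if
   -- the selectedChunk looks like "char 5 to 4 of field 1";
   -- word 2 is the start of the selection or insertion point
   put (word 2 of the selectedChunk) - 1 into tPrev
   if tPrev < 1 then
      -- nothing before the insertion point: opening quote
      put numToChar(147) into tCurly
   else if char tPrev of the text of me is in (space & cr & tab & "([") then
      -- same opening-quote rule as the cleanup code
      put numToChar(147) into tCurly
   else
      put numToChar(148) into tCurly -- otherwise a closing quote
   end if
   -- replaces any selected text, or inserts at the insertion point
   put tCurly into the selection
end keyDown
```

Once no straight quotes are left, the word chunks line up with what you see: a line such as Hello "big wide" World counts as 3 words with straight quotes and 4 once they have been converted, so word 45 of a line is the same word whether you are counting or hiliting.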

-- Peter

Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig




