word counts - what is going on?
Peter M. Brigham
pmbrig at gmail.com
Tue Aug 14 13:17:19 EDT 2012
On Aug 14, 2012, at 12:07 AM, James Hale wrote:
> …. After a bit of to'ing and fro'ing I think I see the problem.
> Word boundaries.
> For example, quoted text is considered 1 word.
> Removing the quotes is ok to find the included words within the quoted text but from then on the word number is out of whack.
> By this I mean that if I see that word 45 of line 1223 is "World" for example, I can't simply hilite word 45 of line 1223 of mytext field and expect the hilite to fall on "World".
>
> It would seem that to avoid these contradictory requirements (need to keep quoted text versus identifying words within the quote) I might need to revisit character positions.
>
> So, back to the drawing board.
Would it work for you to replace straight quotes with curly quotes (character codes 147 and 148 in Windows-1252; they are not part of ASCII proper) before doing anything? I have long been bothered by the HC-legacy (HyperCard) rule of regarding anything within quotation marks as one word. I solved it with a keydown handler in my main user input field that ensured there were no straight quote chars in the field at all. That way I could use text parsing routines that were more rational. Use something like this to clean up your text:
-- a quote at the very start of the text, or right after whitespace or an opening bracket, is an opening quote
if quote = char 1 of tText then put numtochar(147) into char 1 of tText
replace space & quote with space & numtochar(147) in tText
replace cr & quote with cr & numtochar(147) in tText
replace "(" & quote with "(" & numtochar(147) in tText
replace "[" & quote with "[" & numtochar(147) in tText
replace tab & quote with tab & numtochar(147) in tText
-- any straight quote still left must be a closing quote
replace quote with numtochar(148) in tText
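For anyone who wants to reuse this, here is a minimal sketch that wraps the replacements above in a function and shows the effect on word counts. The names smartenQuotes and tSample and the mouseUp test handler are made up for illustration; they are not from any library. It assumes numToChar(147) and numToChar(148) give the characters you want on your platform.

```livecode
-- Hypothetical helper: returns pText with straight quotes replaced,
-- so that the word chunk no longer treats quoted phrases as one word.
function smartenQuotes pText
   local tOpen, tClose, tLeaders, tLead
   put numToChar(147) into tOpen   -- assumed opening curly quote
   put numToChar(148) into tClose  -- assumed closing curly quote
   -- a quote at the very start of the text is an opening quote
   if char 1 of pText is quote then put tOpen into char 1 of pText
   -- a quote right after whitespace or an opening bracket is an opening quote
   put space & cr & tab & "([" into tLeaders
   repeat for each char tLead in tLeaders
      replace tLead & quote with tLead & tOpen in pText
   end repeat
   -- whatever straight quotes remain are closing quotes
   replace quote with tClose in pText
   return pText
end smartenQuotes

on mouseUp
   local tSample, tBefore, tAfter
   put "He said" && quote & "hello big world" & quote && "twice" into tSample
   put the number of words of tSample into tBefore                -- 4
   put the number of words of smartenQuotes(tSample) into tAfter  -- 6
   answer "Before:" && tBefore & cr & "After:" && tAfter
end mouseUp
```

Each replacement swaps one character for one character, so character positions in the cleaned text still line up with the original. Word numbers only line up if the field you hilite holds the cleaned text too.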
-- Peter
Peter M. Brigham
pmbrig at gmail.com
http://home.comcast.net/~pmbrig