problem with counting words

Kay C Lan lan.kc.macmail at gmail.com
Thu Oct 16 05:39:44 EDT 2014


On Wed, Oct 15, 2014 at 12:58 AM, Peter Haworth <pete at lcsql.com> wrote:

>
> a=1,b=2,c=3
>
> That's pretty basic and is easily handled by:

replace comma with cr

Then just work through the lines with the itemDelimiter set to =.
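A minimal sketch of the replace-then-iterate approach above, assuming it is placed in a button script. The variable names (tData, tPair, tValues) are only illustrative:

```livecode
on mouseUp
   local tData, tPair, tValues
   put "a=1,b=2,c=3" into tData
   -- one key=value pair per line
   replace comma with cr in tData
   -- split each line on the equals sign
   set the itemDelimiter to "="
   repeat for each line tPair in tData
      -- store the value in an array keyed by the name
      put item 2 of tPair into tValues[item 1 of tPair]
   end repeat
   -- shows the value stored for key b in the message box
   put tValues["b"]
end mouseUp
```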

What I was thinking about was:

"A","B","C","D" etc

and being able to specify the chunkDelimiter as ","

Of course this too could be handled by:

replace quote & comma & quote with cr, but you are left with a leading and
trailing ".
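A rough sketch of that workaround, including the extra cleanup it needs. It assumes the data starts and ends with a quote and has no trailing return; tData is an illustrative name:

```livecode
on mouseUp
   local tData
   -- build "A","B","C" including the quote characters
   put quote & "A" & quote & comma & quote & "B" & quote & comma & quote & "C" & quote into tData
   -- treat the three-character sequence quote-comma-quote as the delimiter
   replace (quote & comma & quote) with cr in tData
   -- remove the leftover leading and trailing quote
   delete char 1 of tData
   delete char -1 of tData
   -- tData now holds one unquoted value per line
   put tData
end mouseUp
```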

The more real-world example is how LC currently defines words. As far as I
can tell, in regex syntax LC uses [ \f\n\r\t]+

For those unfamiliar with regex, that means any whitespace chars (spaces,
form feeds, line feeds, carriage returns, tabs) can appear together as many
times as you like, but at least once. That's why LC will correctly count
four words in this example, but if the delimiter was changed to a single
space, as has been suggested, you'll get seven:

put "one  two  three  four" into tStore
--simulate a word being delimited by space
set the itemDelimiter to space
--item is now equivalent to word
put the number of items of tStore
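To see where the extra items come from, here is a small sketch that lists every item in brackets next to the word count. It assumes a button script, and tReport, tWordCount and tItemCount are illustrative names:

```livecode
on mouseUp
   local tStore, tReport, tWordCount, tItemCount, i
   -- two spaces between each word
   put "one  two  three  four" into tStore
   put the number of words of tStore into tWordCount
   set the itemDelimiter to space
   put the number of items of tStore into tItemCount
   repeat with i = 1 to tItemCount
      -- brackets make the empty items between double spaces visible
      put i & ": [" & item i of tStore & "]" & cr after tReport
   end repeat
   put tWordCount & " words, " & tItemCount & " items" & cr & tReport
end mouseUp
```

Each pair of adjacent spaces encloses an empty item, which is what a run-based delimiter such as the word chunk avoids.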

There is a current limitation with itemDelimiter, regardless of what it is
set to. Let's go with the default comma:

one , two , three,four

If I refer to any of these items, only the 4th will not include whitespace;
all the others will. So in all my handlers, if I don't want to include the
whitespace, I have to use:

word 1 to -1 of item x of ...
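A short sketch of that idiom applied to the list above, assuming the default comma itemDelimiter. tList, tItem and tClean are illustrative names:

```livecode
on mouseUp
   local tList, tItem, tClean
   put "one , two , three,four" into tList
   repeat for each item tItem in tList
      -- word 1 to -1 drops the whitespace before the first word and after the last
      put (word 1 to -1 of tItem) & cr after tClean
   end repeat
   -- one trimmed item per line
   put tClean
end mouseUp
```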

Not hard, but it would certainly save an enormous amount of file pre-prep
if I could:

set the chunkDelimiter to "[space comma quote]+" --one or more of

"one "," two "," three","four"

put chunk 2 would result in 'two' without any whitespace around it. And
chunk 1 would be 'one', no leading quote and no trailing space.
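There is no chunkDelimiter property today, so the following is only an approximation of the idea using the existing replaceText function and filter command. It assumes lines can stand in for the proposed chunks; tPattern is an illustrative name:

```livecode
on mouseUp
   local tData, tPattern
   -- build "one "," two "," three","four" including the quote characters
   put quote & "one " & quote & comma & quote & " two " & quote & comma & quote & " three" & quote & comma & quote & "four" & quote into tData
   -- character class: one or more of space, comma or quote
   put "[ ," & quote & "]+" into tPattern
   -- every run of delimiter characters becomes a single line break
   put replaceText(tData, tPattern, cr) into tData
   -- drop the empty lines left by the leading and trailing quote
   filter tData without empty
   -- line 2 stands in for "chunk 2"
   put line 2 of tData
end mouseUp
```

Because empty lines are filtered out, this sketch shares the word-like behaviour of never producing an empty chunk.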

Having now thought about it a bit, one BIG GOTCHA I see is that if a
multi-character chunkDelimiter were implemented, it could behave just like
LC words do. By that I mean you can NOT have an empty word, unlike items,
which can be empty.

1","2","3","    ",, ,,," "," "7"," ",","  --using same chunkDelim as above

That is only 4 chunks, with no empty chunks between 3 and 7 and no trailing
empty chunks after 7.

one    two          three                  seven                 -- lots of
trailing tabs and spaces

put empty into word 2

one              three                  seven

put "two" into word 2

one              two                  seven
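The same gotcha as a runnable sketch, using the existing word chunk. It assumes a button script; the spacing in the sample string is arbitrary:

```livecode
on mouseUp
   local tStore, tCount
   put "one    two    three    seven" into tStore
   -- the text "two" is removed, but the whitespace on both sides stays
   put empty into word 2 of tStore
   -- there is no empty word, so word 2 is now "three" and gets overwritten
   put "two" into word 2 of tStore
   put the number of words of tStore into tCount
   put tStore & cr & tCount
end mouseUp
```

With items, the emptied item would keep its position and the later put would fill it back in, leaving "three" untouched.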


