New chunks

Jim Hurley jhurley0305 at sbcglobal.net
Tue Mar 11 18:34:35 EDT 2014


Can someone explain how the “sentence” chunk would work?
How are decimal points and the points in abbreviations distinguished from the “period” that delineates the end of a “sentence”?
Does it presume that the existing text has special embedded “periods”?

I’ve written my own, but it is very cumbersome and not flawless. I use it to do manuscript analysis.
For example: find all sentences in which “time” and “party” both occur anywhere in the same sentence.
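
To make the problem concrete, here is a minimal sketch in Python (not LiveCode, and not the Unicode sentence-boundary rules the proposed chunk would presumably follow). It treats a period as a sentence end only when whitespace or the end of the text follows it, which leaves decimal points alone, and it skips periods that follow a word in a small abbreviation list. The names `ABBREVIATIONS`, `split_sentences` and `sentences_containing` are hypothetical, and the abbreviation list is an assumption that a real manuscript would quickly outgrow.

```python
import re

# Assumed, deliberately short abbreviation list (stored without the final period).
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "st", "vs", "etc", "e.g", "i.e"}

# A run of terminators, optional closing quotes or brackets, then whitespace
# or end of text. The "." in "3.50" is followed by a digit, so it never matches.
BOUNDARY = re.compile(r'[.!?]+["\'”’)\]]*(?=\s|$)')


def split_sentences(text):
    """Split text into sentences using a naive heuristic."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower().lstrip('"\'“‘([') if words else ""
        # A period right after a known abbreviation does not end the sentence.
        if match.group().startswith(".") and last_word in ABBREVIATIONS:
            continue
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # trailing text with no terminator
    return sentences


def sentences_containing(text, *words):
    """Return the sentences in which every given word occurs as a whole word."""
    hits = []
    for sentence in split_sentences(text):
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if all(word.lower() in tokens for word in words):
            hits.append(sentence)
    return hits


sample = ("Dr. Smith had no time for the party. It cost 3.50 pounds! "
          "Was there time? The party ended.")
print(sentences_containing(sample, "time", "party"))
```

Even this small version shows why the approach is not flawless: an abbreviation missing from the list, such as “p.m.”, is still treated as the end of a sentence, and an abbreviation that really does end a sentence is not.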

My ignorance of Unicode is profound.
Jim

> Message: 15
> Date: Tue, 11 Mar 2014 18:15:18 +0000
> From: Benjamin Beaumont <ben at runrev.com>
> To: LiveCode Developer List <livecode-dev at lists.runrev.com>,
>     How to use LiveCode <use-livecode at lists.runrev.com>
> Subject: New chunks
> Message-ID:
> 	<CADd0_Txbhdem4PbKXifXUsujqPLs9HROME6vKhF=Sk1zNp29cQ at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> Hi All,
> 
> We're in the process of adding some new chunk types in LiveCode 7 and we
> would appreciate suggestions for a particular chunk name.
> 
> The new chunk types are:
> 
> naturalword (breaks on unicode word boundaries)
> sentence (breaks on unicode sentence boundaries)
> paragraph (Same behaviour as current 'line' chunk)
> 
> The first chunk is called 'naturalword' because 'word' is already in use.
> Renaming the current 'word' chunk to 'token' to free up 'word' is not an
> option for backward compatibility. We are also limited by the current
> parser which doesn't allow us to use the form:
> 
> put natural word 1 of "this is a string of words"
> 
> 'naturalword' is the clearest internal suggestion at the moment and we'd
> love to get the input from community members if there is an even clearer
> option.
> 
> Warm regards and thank you for your input.
> 
> Ben
> 
> _____