How to find words and phrases as well

Jim Ault JimAultWins at yahoo.com
Sat Jan 28 20:07:09 EST 2006


On 1/27/06 1:18 PM, "André.Bisseret" <Andre.Bisseret at inria.fr> wrote:
> I would like to be able to find not only words but also phrases.

For searching phrases and combinations of words, a few questions come to mind:

?? Is the text fixed, like a reference book, or updated only on a periodic
basis?  Either would allow pre-indexing important phrases ("I am I said", "It
is not for me to say", "pre-building an index is fun, fast, and fulfilling").
This way the user could get hits for words or phrases, or both, that have
the same lineOffset.

You could pre-build an array of the line numbers on which each word occurs,
then keep only the line numbers that are common to every word the user has
typed.  The result is a list of inclusive hits (words on the same line,
regardless of order).  This has the advantage that it can be updated at the
end of each word the user enters ("apple ", "apple orange ", "apple orange
pear " [note the space at the end of each]), so the list of hits gets
progressively smaller.
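
To make the idea concrete, here is a rough sketch of the word-to-line-number
index and the progressive filtering.  It is written in Python rather than
Revolution script, purely as an illustration; the names build_index and
lines_with_all and the sample text are made up for this example, and the
punctuation handling is deliberately crude.

```python
from collections import defaultdict

def build_index(text):
    """Map each lowercased word to the set of 1-based line numbers containing it."""
    index = defaultdict(set)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in line.lower().split():
            # Crude punctuation trimming; a real index would need more care.
            index[word.strip('.,;:!?"()')].add(line_no)
    return index

def lines_with_all(index, query):
    """Return the sorted line numbers that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = None
    for word in words:
        found = index.get(word, set())
        # Keep only the lines shared with the words seen so far.
        hits = set(found) if hits is None else hits & found
    return sorted(hits)

# Hypothetical sample text, one entry per line.
sample_text = "\n".join([
    "apple orange pear",
    "orange apple",
    "pear and apple pie",
    "nothing here",
])
index = build_index(sample_text)

# Re-run the lookup each time the user finishes a word.
for typed in ("apple ", "apple orange ", "apple orange pear "):
    print(repr(typed), lines_with_all(index, typed))
```

Each extra word can only shrink the list, since the hits are an intersection
of the per-word line sets.  Word order is ignored, so this alone does not
tell a phrase from two separate words on the same line.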

?? Could the contents of all the fields be written to one structured or
tagged text file, which is then loaded into a variable and searched
for phrases?

?? Are matches across lines possible (that is, the phrase runs from line 2 to
line 3 of a field), so that returns will be embedded in the result string?
The same question applies to hyphenated words, quoted expressions, phrases
with commas, etc.
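
If phrases may span a line break, one possible approach is to let any run of
whitespace, returns included, stand between the words of the phrase.  The
sketch below is Python, for illustration only; find_phrase and the sample
text are hypothetical, and it does not try to handle hyphenation or
punctuation inside the phrase.

```python
import re

def find_phrase(text, phrase):
    """Return (start_line, end_line) pairs, 1-based, for each phrase match.

    The words of the phrase may be separated by any whitespace in the text,
    so a match can begin on one line and finish on the next.
    """
    words = phrase.split()
    if not words:
        return []
    pattern = re.compile(r"\s+".join(re.escape(w) for w in words), re.IGNORECASE)
    results = []
    for m in pattern.finditer(text):
        # Count the returns before the match start and end to get line numbers.
        start_line = text.count("\n", 0, m.start()) + 1
        end_line = text.count("\n", 0, m.end()) + 1
        results.append((start_line, end_line))
    return results

# Hypothetical field contents; the first occurrence is split across lines 1-2.
sample = "The design of a user\ninterface matters.\nA good user interface helps."
print(find_phrase(sample, "user interface"))
```

Unlike the same-line word filter, this respects word order, so "interface
user" would not match the sample.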

?? Could you convert the spaces inside phrases to a non-breaking space, either
(a) in the fields the user sees and the fields the user types into, or (b) in
a second version of the text fields, where the spaces are converted to some
character that allows entire phrase matching (the_lazy_brown_fox slept in
next to the little_brown_jug)?
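
A minimal sketch of option (b), again in Python just to show the idea: keep
the visible text untouched and build a shadow copy in which the spaces inside
known phrases become underscores, so that each phrase behaves like a single
word.  The names join_phrases and has_token and the phrase list are
assumptions for this example, and matching here is case-sensitive.

```python
def join_phrases(text, phrases, joiner="_"):
    """Return a shadow copy of text with spaces inside known phrases replaced."""
    shadow = text
    # Longest phrases first, so a short phrase does not break up a longer one.
    for phrase in sorted(phrases, key=len, reverse=True):
        shadow = shadow.replace(phrase, phrase.replace(" ", joiner))
    return shadow

def has_token(shadow, query, joiner="_"):
    """True if the query, joined the same way, appears as a whole word."""
    token = query.strip().replace(" ", joiner)
    return token in shadow.split()

# The visible text stays as the user sees it; only the shadow copy is searched.
visible = "the lazy brown fox slept in next to the little brown jug"
known_phrases = ["the lazy brown fox", "little brown jug"]
shadow = join_phrases(visible, known_phrases)
print(shadow)
print(has_token(shadow, "little brown jug"))
```

The trade-off is that a word inside a joined phrase ("brown" above) is no
longer found on its own in the shadow copy, so single-word searches would
still need to run against the original text.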

?? Would it be easier to collect the hit lines into one result field shown
in a substack and then let the user choose?  A substack would allow the
user to view the MainStack while clicking in the results substack.  Also, the
user could create a second card in the substack to show a different
search, or to AND a new search with the previous card's result list.

I'm just not sure what your objective is, both for whoever builds the text
fields and for whoever uses the stack.  It would help if I knew the source of
the text (e.g. web pages, medical reference, user entries, technical manuals,
short stories, news feeds).

It would be cool to know how you solve your challenge.

Jim Ault
Las Vegas


On 1/27/06 1:18 PM, "André.Bisseret" <Andre.Bisseret at inria.fr> wrote:

> Hi ! and thanks for help,
> 
> Every card in my app are including a fld « theText » which displays
> texts.
> In a special fld « keyWords » the user can write one or several words
> separated by spaces, say, theKeyWordsList.
> A button « search » executes « find words theKeyWordsList in fld «
> theText » such as the list of cards whose text includes all the words
> of theKeyWordsList is displayed to the user in a specific results field.
> 
> I would like to be able to find not only words but also phrases. For
> example, the users should be able to enter « user interface » as a
> whole, while now they only can enter the two words « user » and «
> interface ». (not sure I am clear enough ?!).
> How to distinguish, for a find command, « user interface » as a phrase
> from « user interface » as two words.
> 
> André
> from Grenoble
> 




