Building a search engine into a project

Wilhelm Sanke sanke at hrz.uni-kassel.de
Fri May 9 04:32:01 EDT 2003


The search routine I described on Tuesday, May 6, has been accelerated, and
I have also tried out a routine using arrays.

As I had not paid much attention to text searching so far - apart from my
script search utility, where speed was not that important - I am really
astonished at the potential of Metacard/Revolution for dealing with text
search in "free-form databases", i.e. even without the assistance of a
database engine. I hope I do not bore you too much with this matter; many
list members may be familiar with these issues, but this is relatively new
to me.

When using the same features as in the first version

> During the search the following information is shown:
>
> - the number of cards still to be searched
> - the number of fields in which the searchstring was found
> - the cumulative number of hits in these fields
>
> When the search is completed the displayed results comprise:
>
> - the address of the hit: name of field, ID of field, name of card, ID of card
> - the text of the lines of the field with the found searchstring along with the line number
> - the searchstring in each found line is displayed in red
>
> When you click at the line of the address, the respective card of the searched stack is immediately shown.
>
>
the search time for the roughly 14,000 text fields of the test stack
(the old Transcript Dictionary) is now down from 22 seconds to only 4.
(All benchmarks are for an 800 MHz Windows computer; on a slower 667 MHz
Mac G4 running MacOS 10.2.4 the search time is about twice that figure.)

If the information displayed during the search is omitted (number of
fields still to be searched, constantly updated progress bar, cumulative
number of hits), completing the search and displaying the results takes
only two and a half seconds.

It seems that the number of fields searched on each of the 1152 cards of
the test stack makes no difference to the speed; what counts is rather
the number of cards.

Using an approach with arrays:

Putting all the text of the old Revolution Dictionary into an array
takes 1574 milliseconds, plus an additional 40 milliseconds to store that
array in a custom property (if that is intended).

Each subsequent search after that needs only between 350 and 400
milliseconds, including the time needed to color the searchstrings in the
found text lines.
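
For readers who want to see the idea outside of Transcript, here is a
minimal sketch in Python of the "load once, search many times" approach
described above. It is not the actual routine: the sample data, the names
build_index and search, and the case-insensitive matching are all
assumptions made for illustration.

```python
def build_index(cards):
    """Collect the text of every field once, keyed by (card, field).

    `cards` is a hypothetical stand-in for a stack: a dict mapping
    card names to dicts of field name -> field text.
    """
    index = {}
    for card_name, fields in cards.items():
        for field_name, text in fields.items():
            # Store the text pre-split into lines so each search
            # can report line numbers without re-splitting.
            index[(card_name, field_name)] = text.splitlines()
    return index


def search(index, needle):
    """Return (card, field, line_number, line) for each matching line."""
    needle = needle.lower()  # assumption: matching ignores case
    hits = []
    for (card_name, field_name), lines in index.items():
        for line_number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((card_name, field_name, line_number, line))
    return hits


# Hypothetical sample data standing in for the cards of a stack.
cards = {
    "card 1": {
        "syntax": "put value into container",
        "comments": "Use the put command\nto store a value.",
    },
    "card 2": {
        "syntax": "get expression",
        "comments": "The get command places\na value in the it variable.",
    },
}

index = build_index(cards)  # paid once, like filling the array
for hit in search(index, "value"):  # cheap to repeat with new strings
    print(hit)
```

The point of the split is the same as in the timings above: building the
index is the expensive step, and every later search only walks text that
is already in memory.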

The modeless searchstack searches the current topstack and recognizes a
change when a different stack is made the topstack. As the searchstring
is stored in the dialog of the searchstack, it is possible to run a
search with the same searchstring quickly across several stacks.

The search routines are not adapted to a specific arrangement of text
fields (but could certainly be optimized for that). They search all
visible and hidden fields of a stack.

Regards,

Wilhelm Sanke




