How to filter a big list

Jim Ault jimaultwins at yahoo.com
Wed Oct 21 23:41:25 EDT 2009


You're asking a lot of a chunking function to scan a large body of text
between keystrokes.

Start with the following steps to see if these help.

-1-  Showing a list of more than 50 hits may not be useful
-2-  Doing a filter operation with fewer than 3 chars may not be useful
-3-  Showing the number of lines (hits) at the top of the field is  
useful
-4-  Most likely you will need to pre-index the 400K lines to get more  
speed

Indexing is what databases do to boost speed.  You need to decide
what the matching logic is, such as any char in any string, or words
beginning with the user input, etc.
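
As a rough sketch of what a pre-index could look like for the "words
beginning with the user input" logic: build an array once, keyed by the
first 3 chars of every word, so each keystroke only has to filter a small
block. The handler names (buildIndex, candidateLines) and the script local
sIndex are made up for illustration, and this has not been timed against
400K lines.

```livecode
-- hypothetical names throughout; a sketch under the assumptions above
local sIndex  -- array: 3-char word prefix -> lines containing such a word

command buildIndex pAllLines
   put empty into sIndex
   repeat for each line tLine in pAllLines
      repeat for each word tWord in tLine
         -- words shorter than 3 chars are not indexed
         if the number of chars in tWord < 3 then next repeat
         put toLower(char 1 to 3 of tWord) into tKey
         -- note: a line is stored twice if two of its words share a prefix
         put tLine & cr after sIndex[tKey]
      end repeat
   end repeat
end buildIndex

function candidateLines pUserInput
   -- returns the (much smaller) block to run "filter" on
   return sIndex[toLower(char 1 to 3 of word 1 of pUserInput)]
end candidateLines
```

Call buildIndex once when the data is loaded, then filter the result of
candidateLines(userInput) instead of the full list. This trades memory and
start-up time for speed while typing, which fits best if the list is static.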

Is the 400K set of lines dynamic or static?
Does the user type logical words, or phrases?
eg.  santos  -- single word
eg.  Gourgas  -- single word
eg.  dos santos  -- phrase in order
eg.  rue Gourgas  -- phrase in order

If link tables are required, then you should consider a database,  
since this is something they do well.


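  -- skip filtering until the user has typed at least 3 chars
  if the number of chars in userInput < 3 then exit to top

  -- show the hit count at the top of the field
  put "Number of lines = " && \
        the number of lines in filteredBlock & cr into theOutput

  -- show only the first few hits when the list is long
  if the number of lines in filteredBlock > 50 then
     put line 1 to 10 of filteredBlock & cr & "MORE" after theOutput
  else
     put filteredBlock after theOutput
  end if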

The fewer characters in the block of lines to be filtered, the better.
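
One way to keep the block small, sketched here with made-up names
(narrowList, sLastInput, sLastResult): when the new input simply extends the
previous input, every hit must already be in the previous result, so filter
that instead of the full list. This assumes "contains" style matching and
that the user input has no wildcard chars (*, ?, [) of its own.

```livecode
-- hypothetical script locals remembering the previous keystroke's work
local sLastInput, sLastResult

function narrowList pAllLines, pUserInput
   -- reuse the previous hits if the user only added chars at the end
   if sLastInput is not empty and pUserInput begins with sLastInput then
      put sLastResult into tBlock
   else
      put pAllLines into tBlock
   end if
   filter tBlock with "*" & pUserInput & "*"
   put pUserInput into sLastInput
   put tBlock into sLastResult
   return tBlock
end narrowList
```

Deleting a char or editing in the middle falls back to the full list, so the
slow case is still the first keystroke after a change of that kind.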


Hope this helps.

Jim Ault
Las Vegas



On Oct 21, 2009, at 8:47 AM, Jérôme Rosat wrote:

> Thank you Jim, Richard, Brian and Mark,
>
> Please excuse me for answering so late. I posted a message yesterday,
> but it was not published on the list, so I am trying again.
>
> I explained in my message that I wish to filter a list of names and  
> addresses dynamically when I type a name in a field. This list  
> contains 400'000 lines like this:  Mme [TAB] DOS SANTOS albertina  
> [TAB] rue GOURGAS 23BIS [TAB] 1205 Genève
>
> I made various tests using the "repeat for each" loop and the  
> "filter ... with" command. Filtering takes the most time when I type  
> the first and the second letter. That takes approximately 800  
> milliseconds for the first char and about 570 milliseconds for the  
> second char. The repeat loop with the "contains" operator is a
> little bit slower (about 50 milliseconds) than the "filter ...
> with" command. There is no significant difference once the third char
> or more is typed. Of course, I filter a variable before putting it in
> the list field.
>
> Obviously, 800 milliseconds to filter a list of 400'000 lines is
> fast. But it is too slow for what I want to do. Filtering would need
> to take less than 300 milliseconds so that the user is not
> slowed down in his typing.
>
> Sorry to have been insufficiently precise in my first message. I  
> continue my tests and I will publish the fastest code.
>
> Jerome Rosat


