Quickest way to compare 2 CR lists?

Jim Ault JimAultWins at yahoo.com
Tue Nov 4 11:46:28 EST 2008


Or you could use the "among" form
(instead of contains with cr at each end)

"if theLine is among the lines of list2 then"
-- among the items   --  among the words
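
As a rough sketch only, the "among" form could be dropped into the same kind of loop used elsewhere in this thread. The function name linesInBoth is made up for illustration; list1 and list2 are assumed to be CR-delimited strings.

```livecode
-- linesInBoth is a hypothetical helper name, not a built-in.
-- Returns the lines of list1 that also appear as whole lines in list2.
function linesInBoth list1, list2
   local newList
   repeat for each line theLine in list1
      -- "is among the lines of" tests whole lines, so no partial matches
      if theLine is among the lines of list2 then
         put theLine & cr after newList
      end if
   end repeat
   delete last char of newList  -- drop the trailing cr
   return newList
end linesInBoth
```

Because the test is against whole lines, there should be no need to pad list2 with a cr at each end.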


Of course there is the issue of multiple matches (duplicate lines)
and in that case, keys in an array will reduce this to a single entry.
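
A minimal sketch of the array-keys idea, assuming duplicates should collapse to one entry. The function and variable names (uniqueLinesInBoth, lookupA, foundA) are invented for this example.

```livecode
-- uniqueLinesInBoth is a hypothetical helper name, not a built-in.
-- Returns each line of list1 found in list2, listed once only.
function uniqueLinesInBoth list1, list2
   local lookupA, foundA
   -- one array key per distinct line of list2
   repeat for each line theLine in list2
      if theLine is empty then next repeat  -- skip blank lines
      put true into lookupA[theLine]
   end repeat
   -- a key lookup replaces a search through list2 for every line
   repeat for each line theLine in list1
      if theLine is empty then next repeat
      if lookupA[theLine] is true then put true into foundA[theLine]
   end repeat
   return the keys of foundA
end uniqueLinesInBoth
```

As far as I know the keys come back in no particular order, so sort the result if the order matters.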

If preserving the duplicate count is the goal, then use the filter command

repeat for each line theLine in list1
   get list2
   filter it with theLine
   put it & cr after newList
   put the number of lines in it & comma after ckSum
end repeat
filter newList without empty
answer max(ckSum) & " was the maximum number of hits for this list"
--newList should contain all duplicate lines that match

Note to Richard Gaskin, the benchmark master:  Is there any benefit to using

sort list1 numeric ascending by length(each)
sort list2 numeric ascending by length(each)
--then doing 
repeat for each
   ...
end repeat
--such that the lists are skewed so that the shortest comparisons will occur
closest to the beginning of the variable, as opposed to the extreme of the
shortest match happening at the end of the variable list?  Obviously the
same number of match operations must occur, but the sorted lists might yield
more time savings than the sorting operations consume.  Perhaps the reverse
is true (sort descending).

My guess is that sorting will make no difference for short lists of short
lines, and a list of 2000 lines would count as short.
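
One way to check would be a simple timing test along these lines. This is only a sketch: the field names "List1" and "List2" and the helper compareLists are assumptions for illustration, and the sort time is deliberately counted in the second measurement.

```livecode
-- compareLists is a hypothetical helper; it uses the "among" form.
function compareLists list1, list2
   local newList
   repeat for each line theLine in list1
      if theLine is among the lines of list2 then
         put theLine & cr after newList
      end if
   end repeat
   return newList
end compareLists

on mouseUp
   local list1, list2, tStart, tPlain, tSorted
   -- assumed: two fields on the card hold the test data
   put field "List1" into list1
   put field "List2" into list2

   -- time the comparison on the lists as they are
   put the milliseconds into tStart
   get compareLists(list1, list2)
   put the milliseconds - tStart into tPlain

   -- time it again, including the cost of both sorts
   put the milliseconds into tStart
   sort lines of list1 ascending numeric by length(each)
   sort lines of list2 ascending numeric by length(each)
   get compareLists(list1, list2)
   put the milliseconds - tStart into tSorted

   answer "Unsorted:" && tPlain && "ms" & cr & \
         "Sorted, including sort time:" && tSorted && "ms"
end mouseUp
```

Swapping ascending for descending in both sort lines would test the reverse case.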

Hope this helps

Jim Ault
Las Vegas

On 11/4/08 1:37 AM, "Terry Judd" <tsj at unimelb.edu.au> wrote:

> That looks pretty good to me. Maybe put a CR at the front and back of list 2
> and then...
> 
> local newList
> 
> repeat for each line theLine in list1
>   if list2 contains cr&theLine&cr then
>     put theLine&cr after newList
>   end if
> end repeat
> delete last char of newList
> 
> ...that way you'll avoid partial matches (unless you want them).
> 
> Terry...
> 
> 
> On 4/11/08 8:08 PM, "Klaus Major" <klaus at major-k.de> wrote:
> 
>> Hi all,
>> 
>> does anyone know the quickest way to compare 2 CR delimited lists?
>> I need to know what lines of list 1 are contained in list 2.
>> 
>> Right now I am using repeat "for each" and "lineoffset", which is fast,
>> but I'm sure this can be done even faster :-)
>> 
>> List 1 = k1
>> List 2 = k2
>> 
>> ...
>>    repeat for each line i in k1
>>      if lineoffset(i,k2) <> 0 then
>>        put i & CR after new_list
>>      end if
>>    end repeat
>> delete char -1 of new_list
>> return new_list
>> ...
>> 
>> 
>> Best
>> 
>> Klaus Major
>> klaus at major-k.de
>> http://www.major-k.de
>> 
>> 




