Finding non-common elements in two arrays
Buster
wouter.abraham at scarlet.be
Sun Nov 6 19:05:58 EST 2005
"Jim Ault" <JimAultWins at yahoo.com> wrote:
> One catch I can see is the "set whole matches to true"
> also considering the false hits generated by your definition of a unique
> line (lower case, sub string, number format)
> "Mary had a little lamb" = line 6 of field 2
> "Mary had a little lamb, whose fleece was white" = line 8 of field 1
> line 6 of fld 2 is in line 8 of fld 1 => lineoffset would be > 0
>
> "234" & "2345" == offset match, lineOffset not
> "234" & "2,345" == offset match not, lineOffset not
> "234" & "2345.00" == offset match, lineOffset not
> "234" & "2345, 554, 234, 196" == lineoffset match twice
> "snow" & "snow shovel" & "snowbound" & "snow-bound"
>
> Jim Ault
> Las Vegas
Good catch :-)
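For anyone following along, here is a minimal sketch of the difference Jim
describes. The sample strings are made up for illustration. With the
wholeMatches set to false (the default), lineOffset reports a line that
merely contains the search string; with it set to true, only an identical
line counts.

```livecode
on mouseUp
  -- made-up sample data, modelled on the strings quoted above
  put "Mary had a little lamb, whose fleece was white" & cr & "2345" into tList

  set the wholeMatches to false -- the default
  put lineOffset("Mary had a little lamb", tList) into tLoose -- 1: substring hit
  put lineOffset("234", tList) into tLooseNum -- 2: "234" is inside "2345"

  set the wholeMatches to true
  put lineOffset("Mary had a little lamb", tList) into tStrict -- 0: no identical line
  put lineOffset("2345", tList) into tStrictNum -- 2: the whole line matches

  put tLoose && tLooseNum && tStrict && tStrictNum
end mouseUp
```

So the false hits only appear when the wholeMatches is left at false.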
"Alex Tweedly" <alex at tweedly.net > wrote:
-snip-
> > put fld "Field" & cr & "ZZZZZZZZZZ" into t1
> > put fld "Field" & cr & "test line" & cr & "ZZZZZZZZZZ" into t2
> >
> > put the millisecs into tStart
> > put 1 into i2
> > put the number of lines in t2 into limit2
> >
> > sort t1
>
> > sort t2
> > split t2 by CR
> > put t2[1] into L2
> >
> > repeat for each line L1 in t1
> >   repeat while L2 < L1
> >     add 1 to i2
> >     put t2[i2] into L2
> >   end repeat
> >   if L2 = L1 then
> >     -- put L1 & cr after tBoth
> >     add 1 to i2
> >     put t2[i2] into L2
> >   else
> >     -- put L1 & cr after t1only
> >   end if
> > end repeat
> > if i2 < limit2 then
> >   repeat with i = i2 to limit2-1
> >     put t2[i] & cr after t2only
> >   end repeat
> > end if
> > put "loop" && the millisecs - tStart & cr after msg
>
>
> P.S. I tried hard to break every one of Jerry's recommendations about
> variable naming as described in his excellent tutorial from the
> "Conference" session; if you haven't already downloaded and read that
> stack, you should. It *might* just stop you from writing such ugly code
> as I did above - but my old Fortran habits just keep coming back :-)
>
>
> --
> Alex Tweedly http://www.tweedly.net
The handler above does not give correct results, whether on numeric
lists or on word or mixed lists.
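One likely contributor on numeric lists (this is my reading, not something
traced line by line through the handler): "sort" without a sort type sorts
as text, while "<" compares numerically when both operands look like
numbers, so a merge loop can end up walking the lists in an order that
disagrees with its own comparisons. A small sketch with made-up values:

```livecode
on mouseUp
  put "9" & cr & "10" & cr & "100" into tNums -- made-up sample values

  sort lines of tNums -- no sort type given, so this is a text sort
  put tNums into tTextOrder -- 10, 100, 9 (one per line)

  put ("10" < "9") into tCompare -- false: both are numbers, compared numerically

  sort lines of tNums numeric
  put tNums into tNumericOrder -- 9, 10, 100 (one per line)

  put tTextOrder & cr & tCompare & cr & tNumericOrder
end mouseUp
```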
Here follows a function which combines and adapts techniques mentioned
previously in this thread.
### adapt the handler name and the filter modes to your own taste
function intersectSpecial pList1,pList2,pMode
  ### tag every distinct line: 1 = in pList1, 2 = in pList2, 3 = in both
  ### (tags are set rather than summed, so duplicate lines within one
  ### list cannot be mistaken for lines of the other list)
  repeat for each line i in pList1
    put 1 into a[i]
  end repeat
  repeat for each line i in pList2
    if a[i] is 1 or a[i] is 3 then put 3 into a[i]
    else put 2 into a[i]
  end repeat
  ### one line per element, in the form: element & tab & tag
  combine a with cr and tab
  ### elements only in pList1 --> 1
  ### elements only in pList2 --> 2
  ### elements in both lists --> 3
  if pMode = "bothCommon" then put "*"&tab&"3" into tFilter
  else if pMode = "uniqueA" then put "*"&tab&"1" into tFilter
  else if pMode = "uniqueB" then put "*"&tab&"2" into tFilter
  else if pMode = "bothUnique" then put "*"&tab&"1,*"&tab&"2" into tFilter
  repeat for each item tFilterString in tFilter
    put a into b
    filter b with tFilterString
    ### strip the tab and tag from the lines that were kept
    replace char 2 to -1 of tFilterString with "" in b
    put b & cr after tList
  end repeat
  return tList
end intersectSpecial
on mouseUp
  put the millisecs into zap
  put intersectSpecial(fld 1,fld 2,"bothUnique") into fld 3
  put the millisecs - zap
end mouseUp
Maybe not a real speed monster, but not bad either
(it takes < 500 millisecs for 2 fields with > 25000 lines on a 1.8 GHz
iMac G5).
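A quick way to check the modes is to call the function on two tiny lists.
The sample values and the variable names (tA, tB and so on) are made up for
illustration. Keep in mind that the returned lines come out in the array's
own key order, so they are not guaranteed to follow the order of the input
lists, and that each result ends with a return.

```livecode
on mouseUp
  -- made-up sample lists, no duplicate lines
  put "apple" & cr & "pear" & cr & "plum" into tA
  put "pear" & cr & "fig" into tB

  put intersectSpecial(tA, tB, "bothCommon") into tBoth -- "pear" plus a return
  put intersectSpecial(tA, tB, "uniqueA") into tOnlyA -- "apple" and "plum", in some order
  put intersectSpecial(tA, tB, "uniqueB") into tOnlyB -- "fig" plus a return

  -- each result already ends with a return, so they can simply be joined
  put tBoth & tOnlyA & tOnlyB
end mouseUp
```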
Greetings,
Wouter