Finding non-common elements in two arrays

Jim MacConnell jmac at consensustech.com
Tue Nov 8 00:59:30 EST 2005


<Meant to send earlier but email issues prevented send.>

Bruce,
I'm sure you've had enough input on this, but I thought a little more
couldn't hurt. I don't have time to see what changes would be required for
arrays, but I assume they won't present any difficulties.

From the Rev 2.2 Cookbook:
####################
Problem
You have two containers--variables, fields, or URLs--and you want to list
all the lines of text that are found in one of them, but not in both.

Solution:
on mouseUp
  get uncommonLines(field "List 1", field "List 2")
  if it is empty then answer "There are no differing lines."
  else answer it
end mouseUp
 
function uncommonLines firstList,secondList
  -- find lines in the first list but not the second list:
  repeat for each line thisLine in firstList
    if thisLine is not in secondList then -- include this line
      put "-" && thisLine & return after uncommon
    end if
  end repeat
  -- find lines in the second list but not the first list:
  repeat for each line thatLine in secondList
    if thatLine is not in firstList then -- include this line
      put "+" && thatLine & return after uncommon
    end if
  end repeat
  if last char of uncommon is return
  then delete last char of uncommon -- strip trailing return
  return uncommon
end uncommonLines


Discussion:
This custom function is called with a statement such as the following:

  put uncommonLines(field "Original",field "New") into myLines

The "uncommonLines" function takes two lists, finds the lines that are in
one list, but not in the other, and marks them with a leading "+" or "-"
depending on which of the two lists the line is found in. (If you start with
the first list, add the lines marked with "+", and delete the lines marked
with "-", you get the second list.)
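
As a concrete illustration of the "+" and "-" markers, here is a small
sketch with made-up sample data. The variable names oldList and newList are
assumptions for the example, not part of the Cookbook:

  on mouseUp
    -- hypothetical sample lists, three lines each
    put "apple" & return & "banana" & return & "cherry" into oldList
    put "banana" & return & "cherry" & return & "date" into newList
    answer uncommonLines(oldList, newList)
    -- the dialog shows two lines:
    --   - apple
    --   + date
  end mouseUp

Deleting "apple" from the first list and adding "date" gives the second list.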

The handler goes through each list in turn, and builds a third list
consisting of the lines that the lists don't have in common. First, it uses
a repeat control structure to scan each line in the firstList parameter. If
the line doesn't also appear in the secondList parameter, the handler adds
that line to a variable called "uncommon", prepending a "-" to the line with
the && operator. (The && operator places a space between the "-" and the
text of the line for better readability.) It also adds a return character so
that the next line added will start on a new line.

Next, the handler uses a similar repeat loop to scan each line in the
secondList. If the line isn't in the firstList, the handler adds the line to
the "uncommon" variable, prepending a "+" this time.

Since the handler adds a return character to "uncommon" after each uncommon
line it finds, the last character of "uncommon" is a trailing return. The
handler deletes this trailing return and then returns the contents of the
"uncommon" variable.
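
One caveat about the comparison itself: the "is not in" operator tests
whether the text appears anywhere in the other container, so a line such as
"cat" is treated as present when the other list contains only "catalog".
If whole-line matches are what you need, a minimal sketch follows. It
assumes the "is not among the lines of" operator is available in your
version of Revolution, and the handler name uncommonWholeLines is invented
for illustration:

  function uncommonWholeLines firstList,secondList
    -- lines in the first list with no identical line in the second:
    repeat for each line thisLine in firstList
      if thisLine is not among the lines of secondList then
        put "-" && thisLine & return after uncommon
      end if
    end repeat
    -- lines in the second list with no identical line in the first:
    repeat for each line thatLine in secondList
      if thatLine is not among the lines of firstList then
        put "+" && thatLine & return after uncommon
      end if
    end repeat
    -- strip trailing return, as in the original handler
    if last char of uncommon is return then delete last char of uncommon
    return uncommon
  end uncommonWholeLines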

A note on efficiency:
To make this handler faster, we've used the repeat for each form of the
repeat control structure. A more traditional way to write the first repeat
loop would be like this:

  repeat with x = 1 to the number of lines in firstList
    if line x of firstList is not in secondList then
      put "-" && line x of firstList & return after uncommon
    end if
  end repeat

Many languages support this form of repeat loop, and Transcript does too:
this form will work the same way as the one we used. However, the repeat for
each form is much faster--depending on the circumstances, up to hundreds of
times faster--than the repeat with x = start to end form.
####################
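
For the array case raised in the original question, one possible adaptation
is to use each line as an array key, so that every membership test becomes
an array lookup instead of a search through the whole list. This is only a
sketch under assumptions: the handler name uncommonLinesByKey and the array
names inFirst and inSecond are invented here, empty lines are skipped rather
than used as keys, and how letter case is treated in keys may depend on your
version:

  function uncommonLinesByKey firstList,secondList
    -- build one lookup array per list, keyed by the line text
    repeat for each line thisLine in firstList
      if thisLine is not empty then put true into inFirst[thisLine]
    end repeat
    repeat for each line thatLine in secondList
      if thatLine is not empty then put true into inSecond[thatLine]
    end repeat
    -- lines of the first list that are not keys of the second:
    repeat for each line thisLine in firstList
      if thisLine is empty then next repeat
      if inSecond[thisLine] is not true then
        put "-" && thisLine & return after uncommon
      end if
    end repeat
    -- lines of the second list that are not keys of the first:
    repeat for each line thatLine in secondList
      if thatLine is empty then next repeat
      if inFirst[thatLine] is not true then
        put "+" && thatLine & return after uncommon
      end if
    end repeat
    if last char of uncommon is return then delete last char of uncommon
    return uncommon
  end uncommonLinesByKey

It is called the same way as the Cookbook function, and because keys are
compared as whole strings it matches complete lines only.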

Jim

PS: Hope I'm not violating some copyright terms by copying this on the list.


James H. MacConnell

Consensus Technology, LLC
www.consensustech.com




On Nov 5, 2005, at 2:39 PM, Bruce A. Pokras wrote:

The intersect function will return the common elements of two arrays. Is
there any array methodology that will return the non-common elements of two
arrays? I download a government list each month, and I would like an easy
way to spot the records that are either new or changed. Currently, I use a
repeat structure to go through the old list line-by-line to see if any lines
are not identically found in the new list (using offset), and collect a
third list of any lines in which offset=0. After splitting the old and new
lists into arrays, an "inverse-intersect" function or equivalent would seem
to be a more elegant way of doing this.

Any ideas?

Regards,

Bruce Pokras
Blazing Dawn Software
www.blazingdawn.com