Near Text Search

dunbarx at aol.com dunbarx at aol.com
Wed Feb 26 21:14:09 EST 2014


Hi


If I understand what you want, why not just append the ASCII values to create a much longer number?


on mouseup
   answer textToNum("abcdefgh")
end mouseup


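function textToNum theString
   -- lower() returns a new string, so store the result back in theString
   put lower(theString) into theString
   put empty into theAsciiCode
   -- append the ASCII code of each char to build one long number
   repeat for each char theAscii in theString
      put charToNum(theAscii) after theAsciiCode
   end repeat
   return theAsciiCode
end textToNum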


I have made several Scrabble utilities, and for most, I first alphabetize the chars in each word, and then do exactly what the above function does.
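A minimal sketch of that "alphabetize first, then encode" idea follows. The handler names sortChars and sortedTextToNum are made up for illustration, and sorting the chars by putting each one on its own line is just one way to do it, not necessarily how the utilities mentioned above work.

```livecode
-- Hypothetical helper: returns the chars of theString in alphabetical order
function sortChars theString
   put empty into theLines
   -- put each char on its own line so the sort command can order them
   repeat for each char theChar in theString
      put theChar & return after theLines
   end repeat
   sort lines of theLines
   replace return with empty in theLines
   return theLines
end sortChars

-- Hypothetical wrapper: alphabetize the chars, then append their ASCII codes
function sortedTextToNum theString
   put sortChars(lower(theString)) into theString
   put empty into theAsciiCode
   repeat for each char theChar in theString
      put charToNum(theChar) after theAsciiCode
   end repeat
   return theAsciiCode
end sortedTextToNum
```

With this, "tale" and "late" both sort to "aelt" and so produce the same number. That is what a Scrabble utility wants for anagrams, but it also means transposed letters in a customer name would be treated as identical.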


Craig




-
From: Bob Sneidar <bobsneidar at iotecdigital.com>
To: How to use LiveCode <use-livecode at lists.runrev.com>
Sent: Wed, Feb 26, 2014 9:06 pm
Subject: Near Text Search


Hi all.

I’m trying to devise a way to implement a “near text search” when querying a
MySQL database. My problem is that I get spreadsheet forms filled out by hand
from our dispatcher, and he sometimes makes typos, just small ones, and I need
to ensure there are no near-duplicate customer records in my application. So I
need to query the database in such a way that I come up with the nearest
neighbor. I could do this easily in FoxPro, because it provides an argument for
it, but I’ve searched around and no one seems to be able to produce a nearest
neighbor search for text! You can do it for numbers, just not text.

So now I’m trying to devise a way to convert a string to a number in such a way
that two different strings would be extremely unlikely to produce the same
number. So far I’ve come up with this:

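function textToNum theString
   -- lower() returns a new string, so store the result back in theString
   put lower(theString) into theString
   put 0 into theNum
   put 1 into theSeed
   repeat for each char theAscii in theString
      -- weight each char's ASCII code by its position in the string
      put charToNum(theAscii) into theAsciiCode
      add (theAsciiCode*theSeed) to theNum
      add 1 to theSeed
   end repeat
   return theNum
end textToNum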

The idea is that each character’s code would be multiplied by a seed value
representing its position in the string. However, I can foresee that it would be
statistically possible for two completely different strings to get pretty close
and even to match exactly. I *could* use a seed value equal to the number of
lowercase printable characters in the lower ASCII table, but that could produce
HUGE numbers, and I am afraid of overflows.
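To make the collision concern concrete, here is a small sketch that repeats the position-weighted sum under a different, made-up name (positionWeightedSum) so it can be pasted on its own, and runs it on two short strings picked by hand to collide.

```livecode
-- Same weighting scheme as textToNum above, renamed so it stands alone
function positionWeightedSum theString
   put lower(theString) into theString
   put 0 into theNum
   put 1 into theSeed
   repeat for each char theChar in theString
      add (charToNum(theChar) * theSeed) to theNum
      add 1 to theSeed
   end repeat
   return theNum
end positionWeightedSum

on mouseUp
   -- "ac": 97*1 + 99*2 = 295
   -- "cb": 99*1 + 98*2 = 295
   answer positionWeightedSum("ac") && positionWeightedSum("cb")
end mouseUp
```

Both strings come out as 295, so even two-character inputs can collide when the weights are small consecutive integers.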

Any thoughts?

Bob