SpellCheck (re-inventing the wheel)

Brian Yennie briany at qldlearning.com
Sat Jan 29 16:28:02 EST 2005


Hi,

You might check out something like this:
http://www.nist.gov/dads/HTML/doubleMetaphone.html

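Double Metaphone is one of several similar phonetic-encoding algorithms. 
Basically it would let you match words that sound alike even when they 
are spelled differently, which is a good basis for a suggestion list.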
You might also look at some of the links here:
http://aspell.sourceforge.net/

And here is another link to spell checker friendly word lists:
http://wordlist.sourceforge.net/
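
For the simpler prefix lookup you describe (type 1-3 initial chars, get 
the words starting with them), here is a minimal, untested sketch. It 
assumes gWords holds one word per line; if your list is space-separated, 
something like 'replace space with return in gWords' would be needed 
first. The handler name wordsStartingWith and the field name 
"suggestions" are made up for illustration.

```livecode
-- Hypothetical helper (not from the thread): returns the lines of gWords
-- that begin with pPrefix. Assumes gWords holds one word per line.
function wordsStartingWith pPrefix
  global gWords
  -- an empty prefix would match every word, so return nothing instead
  if pPrefix is empty then return empty
  -- filter changes its container, so work on a copy of the list
  put gWords into tMatches
  -- keep only the lines matching the wildcard pattern "<prefix>*"
  filter tMatches with (pPrefix & "*")
  return tMatches
end function
```

You could then do something like
  put wordsStartingWith("Che") into field "suggestions"
Note that filter treats characters such as "*", "?" and "[" in the prefix 
as wildcards, so this only behaves as a plain prefix match for ordinary 
letters.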

HTH!
- Brian

> Alex:
>
> Thanks *so* much for this... I really need it... If anyone gets their 
> head around some fast algorithms for pulling up a short suggestion 
> list before I do, please post it.
>
> My context is that most words that the users will mis-spell are 
> specialized (Sanskrit, Tamil, names of obscure places, e.g. "Chennai"). 
> So, whereas with an obvious mis-spelling like "mikl" the user will know 
> how to change it to "milk", for the specialized words she will need to 
> see a selection of choices...
>
> I know we could go "crazy" with this and code for auto-replacement of 
> the mis-spelled word, etc., which adds a new layer of complexity, but 
> for now I would be satisfied with manual user entry into a separate 
> stack where they could enter 1, 2, or 3 initial chars and get a list of 
> words starting with those characters (in 99.9% of cases the first char 
> is assured). A click could then put the chosen word on their clipboard 
> and they could paste it over the mis-spelled word. I would have our 
> master all-publications lexicon entries appended to that global 
> variable gWords, to supplement the main word list.
>
> Or, we *could* go nuts and do some repeat loop with a word offset and 
> replace all instances of the mis-spelled word, if it exists, etc.
>
> Sivakatirswami
> Himalayan Academy Publications
> at Kauai's Hindu Monastery
> katir at hindu.org
> www.HimalayanAcademy.com,
> www.HinduismToday.com
> www.Gurudeva.org
> www.Hindu.org
>
> On Jan 27, 2005, at 3:14 PM, Alex Tweedly wrote:
>
>> This took on average 8 millisecs per word in tWords. Perfectly 
>> adequate for small input "documents".
>>
>> Then I tried a slightly more complex way:
>> setup
>>
>>>   put url ("file:" & tFile) into gWords
>>>   repeat for each word w in gWords
>>>     put 1 into  gArray[w]
>>>   end repeat
>>
>> and then
>>
>>>   put tWords into field "inField"
>>>   put 0 into t
>>>   repeat for each word w in tWords
>>>     add 1 to t
>>>     replace "." with empty in w
>>>     replace "," with empty in w
>>>     replace "!" with empty in w
>>>     if gArray[w] <> 1 then
>>>       set the textstyle of word t of field "inField" to "bold"
>>>     end if
>>>   end repeat
>>
>> This took 2 millisecs for 50 words, so would be reasonable for even 
>> large-ish documents.
>


