ANN Daily Crytoquote--my misspelling

Marielle Lange M.Lange at ed.ac.uk
Thu Jun 23 06:08:45 EDT 2005


>> My dictionary of 61,000 words comes in at 592 K--similar
>> to yours in size. The problem is that it includes a lot of words I've
>> never heard of. For example the dictionary begins with the following:
>
>> aardvark, aardwolf, aba, abaca, abacist, aback, abacus, abaft, abalone,
>> abamp, abampere, abandon, abandoned, abase, abash, abate, abatement,
>> abatis, abattoir, abaxial, abb, abba, abbacy, abbatial

>Well, FRELI won't help you there. It's got those too. Too bad we can't
>write a regex that means "take out everything obscure."

On the lexicall website, some databases include frequency information, and you
can enter a range of values of your choosing, so you can require words to have
a minimum frequency. (Trick: these unusual words have a frequency of -1 or 0;
if you take only words with a frequency of 10 or more, you are pretty safe.)
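
If you would rather do the same filtering on a list you already have locally,
here is a minimal sketch of the idea. It assumes you have each word paired with
a frequency count; the function name and the sample frequencies are made up for
illustration and are not taken from any lexicall database.

```python
def filter_by_frequency(freqs, minimum=10):
    """Return the words whose frequency is at least `minimum`, sorted."""
    return sorted(word for word, freq in freqs.items() if freq >= minimum)


# Hypothetical sample: these frequency values are invented for illustration.
# Obscure words get -1 or 0, as described above.
sample = {"abandon": 25, "abacus": 12, "abamp": -1, "abatis": 0, "abb": 0}

common = filter_by_frequency(sample)
print(common)
```

Raising or lowering `minimum` makes the resulting list stricter or more
permissive.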

The URL below takes you directly to the list of words that have a frequency of
10 or more.

http://lexicall.org/repository/results.php?mtd_file=data%2F2_words%2Fenglish%2Fdb_mrc.mtd&flds%5B1%5D=WORD&minvals%5B5%5D=10&submit=Submit

Make sure you enter it as one continuous line in your browser. The 500-word
limit has been removed and will remain so for a week, so you will see the full
list on your screen. Be patient: 10262 words.

In the query above, the 10 corresponds to the minimum frequency value. You can
try with higher and lower values and see when you get the list that best suits
your needs.
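
To try several thresholds without editing the long line by hand, something
like the following could generate the URL for you. This is only a sketch: the
parameter names are copied from the query above, the function name is my own,
and it only builds the string (it does not contact the site).

```python
from urllib.parse import urlencode

BASE = "http://lexicall.org/repository/results.php"


def build_query_url(min_frequency):
    """Build the results URL for a given minimum frequency value."""
    params = [
        ("mtd_file", "data/2_words/english/db_mrc.mtd"),
        ("flds[1]", "WORD"),
        # minvals[5] carries the minimum frequency, as in the query above
        ("minvals[5]", str(min_frequency)),
        ("submit", "Submit"),
    ]
    # urlencode percent-encodes "/", "[" and "]" the same way the URL above does
    return BASE + "?" + urlencode(params)


for threshold in (5, 10, 50):
    print(build_query_url(threshold))
```

With a threshold of 10 this reproduces the URL given above, as one continuous
line.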

Marielle
