Semi-automatic Index generation?
Eric Chatonet
eric.chatonet at sosmartsoftware.com
Wed Jul 30 10:18:48 EDT 2008
Hello David,
On 30 Jul 08 at 16:08, David Bovill wrote:
> Is there a resource or index that anyone knows of for plain,
> uninteresting, dull words? I want to take arbitrary chunks of text
> and search for "interesting" words, that is, domain-specific words
> that might be useful to link to when creating dictionary entries.
> This would mean creating a list of words and stripping "the", "it",
> etc. I imagine it working like a spelling dictionary with the
> ability to manually edit entries, but I'd like a good starting
> list. Not sure what to search for :)
1. Try searching for 'stopwords' (the usual term for uninteresting
words) with any Internet search engine.
2. Have a look also at 'stemming', which reduces different forms of
a word to the same stem:
http://www.comp.lancs.ac.uk/computing/research/stemming/general/
3. I have put an English, French, Italian, Spanish, German and
Portuguese stemmer library on RevOnline (username: sosmartsoftware)
that could help you too.
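
To make point 1 concrete, here is a minimal sketch of stopword filtering. It is written in Python rather than Revolution, and everything in it is an assumption made for illustration: the tiny STOPWORDS set stands in for a proper published list, and the function name interesting_words and its min_length parameter are hypothetical. It is not taken from the stemmer library mentioned above.

```python
import re
from collections import Counter

# Hypothetical starter list. In practice you would load a published
# stopword list and then edit it by hand, like a spelling dictionary.
STOPWORDS = {
    "the", "it", "a", "an", "and", "of", "to", "in", "is", "that",
    "for", "on", "with", "as", "this", "be", "are",
}


def interesting_words(text, stopwords=STOPWORDS, min_length=3):
    """Return (word, count) pairs for the non-stopwords, most frequent first."""
    # Lowercase the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Keep only words that are long enough and not in the stopword set.
    counts = Counter(
        w for w in words if w not in stopwords and len(w) >= min_length
    )
    return counts.most_common()


sample = "The stemmer reduces the words to a stem, and the stemmer is fast."
result = interesting_words(sample)
for word, count in result:
    print(word, count)
```

Because the stopwords live in an ordinary set, adding or removing an entry by hand is a one-line change. A stemmer could be applied to each word before counting so that variants such as "word" and "words" are tallied together.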
Best regards from Paris,
Eric Chatonet.
----------------------------------------------------------------
Plugins and tutorials for Revolution: http://www.sosmartsoftware.com/
Email: eric.chatonet at sosmartsoftware.com
----------------------------------------------------------------