Finding common words and phrases in a large block of text?
Tom Glod
tom at makeshyft.com
Thu Oct 25 13:26:23 EDT 2018
Hi Terry, glad you found a solution.
I have a similar challenge.
I did a word count, but would love to recognize repeated phrases as well.
Did you just compare chunks directly, or hash them first (probably redundant)?
Are there any more hints you can drop about this?
Thanks,
Tom
On Thu, Oct 25, 2018 at 4:27 AM Terry Judd via use-livecode <
use-livecode at lists.runrev.com> wrote:
> OK - was easier than I thought. I have something that works fast enough by
> iterating through runs of words in each sentence in a block of text,
> incrementing counts into an array and then sorting the contents of that
> array by phrase length and frequency.
>
> Terry...
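
Terry's message does not include code, so what follows is only a minimal sketch of the approach described above, written in Python rather than LiveCode. The function names, the 2-to-4-word phrase range, the sentence-splitting pattern, the minimum repeat count and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_phrases(text, min_len=2, max_len=4):
    """Count every run of min_len..max_len consecutive words in each sentence."""
    counts = Counter()
    # Split on sentence-ending punctuation so no phrase spans two sentences.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z0-9']+", sentence)
        for n in range(min_len, max_len + 1):
            # Slide a window of n words across the sentence (overlapping runs).
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


def rank_phrases(counts, min_count=2):
    """Keep repeated phrases; sort longest first, then most frequent."""
    repeated = [(p, c) for p, c in counts.items() if c >= min_count]
    return sorted(repeated, key=lambda pc: (-len(pc[0].split()), -pc[1], pc[0]))


# Hypothetical sample text, not taken from the journal abstracts.
sample = ("Online learning improves student engagement. "
          "Student engagement matters in online learning. "
          "Online learning is growing.")
ranked = rank_phrases(count_phrases(sample))
print(ranked)
```

Because the counter is keyed on the phrase text itself, no separate hashing step is needed in this sketch: the dictionary hashes its keys internally. For a large corpus, raising `min_count` or narrowing the phrase-length range keeps the result list manageable.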
>
> On 25/10/2018 4:56 pm, "use-livecode on behalf of Terry Judd via
> use-livecode" <use-livecode-bounces at lists.runrev.com on behalf of
> use-livecode at lists.runrev.com> wrote:
>
> Hi – I’m looking to analyse some large blocks of text (journal
> abstracts from key educational technology journals over a several-year
> period) to find common words and phrases. Finding common words should be
> easy enough but I’m not clear on what approach to take for finding common
> phrases (iterating through the text capturing overlapping word runs of
> various lengths?). Any ideas on how best to proceed?
>
> TIA,
>
> Terry...
> _______________________________________________
> use-livecode mailing list
> use-livecode at lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your
> subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode