use-livecode Digest, Vol 171, Issue 44
mark at livecode.com
Thu Dec 28 09:42:55 EST 2017
On 2017-12-28 01:26, Peter Reid via use-livecode wrote:
> So far I have everything working apart from the comparison of 2 WAV
> files, in particular the following is working:
I can't really speak to the domain of application (not knowing very
much, if anything, about it). However, in terms of comparisons of audio
clips in the way you suggest, software like Rosetta Stone does this
kind of thing: Rosetta Stone has a section where you have to speak
words and snippets in the language you are attempting to learn and it
then determines 'how close' you are to how it should be said. How
accurate this is I'm not entirely sure - but I think it sounds like
exactly the same problem you are trying to solve.
Now, I'm not sure how Rosetta Stone actually does the analysis and
comparisons - there obviously has to be some sort of normalization
process, accounting for speed, pitch, volume etc.; and presumably to
make it in any way 'useful' the individual syllables / vocalisations
would have to be split up and then compared individually (trying to find
out how many pieces of the spoken audio match up to the reference audio)
- the latter would be what a percentage score could be based on.
I wonder if one of the online speech-to-text services could be used as
the engine here. Both Google and Microsoft offer a cloud-based service -
you send it a clip of audio and it sends back a list of possible
recognitions, each with a confidence percentage. You could potentially
use this:
1) You submit the captured audio for the word/phrase which is being
tested for; from this you will get a transcription or list of potential
transcriptions, each with a confidence percentage.
2) For each potential transcription returned, use the Levenshtein
distance (https://en.wikipedia.org/wiki/Levenshtein_distance) algorithm
to work out which is the 'most similar' to the reference word/phrase.
3) The percentage score for the test is then the confidence percentage
of the closest match (the one with the smallest distance found in 2),
modulated by a suitable function of that distance (which would probably
require some empirical testing).
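Steps 2 and 3 could be sketched roughly as below. This is only an
illustration in Python, not tied to any particular speech service: the
shape of the candidate list (pairs of recognised text and a confidence
between 0 and 1), the name score_attempt, and the simple linear
'modulation' by distance are all assumptions made for the example. The
real function would need the empirical testing mentioned above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def score_attempt(reference, candidates):
    """Return a 0-100 score for one spoken attempt.

    candidates is an assumed shape: a list of (text, confidence) pairs,
    with confidence in the range 0..1, as a service might report them.
    """
    if not candidates:
        return 0.0
    ref = reference.strip().lower()

    def distance_to_ref(candidate):
        return levenshtein(ref, candidate[0].strip().lower())

    # Step 2: pick the candidate closest to the reference phrase.
    best = min(candidates, key=distance_to_ref)
    best_text, best_conf = best[0].strip().lower(), best[1]
    distance = distance_to_ref(best)

    # Step 3: scale the confidence by a distance-based factor.
    # Linear scaling by the longer length is an assumption, not a tuned choice.
    longest = max(len(ref), len(best_text), 1)
    similarity = 1.0 - distance / longest
    return round(100.0 * best_conf * similarity, 1)
```

For instance, score_attempt("hello", [("hallo", 0.8)]) finds a distance
of 1 over 5 characters, so the 80% confidence is scaled by 0.8.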
It sounds like an interesting small app - whilst there appears to be
some dubiety (in the responses on the list) about potential
effectiveness, seeing if it can help can't hurt, can it?
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps