use-livecode Digest, Vol 171, Issue 44

Mark Waddingham mark at livecode.com
Thu Dec 28 09:42:55 EST 2017


Hi Peter,

On 2017-12-28 01:26, Peter Reid via use-livecode wrote:
> So far I have everything working apart from the comparison of 2 WAV
> files, in particular the following is working:

I can't really speak to the domain of application (not knowing very 
much, if anything, about it). However, in terms of comparing audio 
clips in the way you suggest, software like Rosetta Stone does this 
kind of thing: Rosetta Stone has a section where you have to speak 
words and snippets in the language you are attempting to learn and it 
then determines 'how close' you are to how it should be said. How 
accurate this is I'm not entirely sure, but I think it sounds like 
exactly the same problem you are trying to solve.

Now, I'm not sure how Rosetta Stone actually does the analysis and 
comparisons - there obviously has to be some sort of normalization 
process, accounting for speed, pitch, volume etc.; and presumably to 
make it in any way 'useful' the individual syllables / vocalisations 
would have to be split up and then compared individually (trying to find 
out how many pieces of the spoken audio match up to the reference audio) 
- the latter would be what a percentage score could be based on.

I wonder if one of the online speech-to-text services could be used as 
the engine here. Both Google and Microsoft offer a cloud-based service: 
you send it a clip of audio and it sends back a list of possible 
recognitions, each with a confidence percentage. You could potentially 
use this:

   1) You submit the captured audio for the word/phrase being tested; 
from this you will get a transcription, or a list of potential 
transcriptions, each with a confidence percentage.

   2) For each potential transcription returned, use the Levenshtein 
distance (https://en.wikipedia.org/wiki/Levenshtein_distance) algorithm 
to work out which is the 'most similar' to the reference word/phrase.

   3) The percentage score for the test is then the confidence percentage 
of the closest match found in 2, modulated by a suitable function of 
the computed distance (choosing that function would probably require 
some empirical testing).
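
As a rough illustration of steps 2 and 3, here is a minimal sketch in 
Python (not LiveCode, and not tied to any particular speech service). It 
assumes the service's results have already been reduced to a list of 
(text, confidence) pairs with confidence in the range 0 to 1; the names 
levenshtein and score are made up for this example. The 'suitable 
function' is simply taken to be 1 - distance / length of the longer 
string, which is only a placeholder guess and would need the empirical 
tuning mentioned above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def score(reference: str, candidates: list[tuple[str, float]]) -> float:
    """Return a 0..100 score for the reference word/phrase.

    candidates is an assumed, simplified form of a speech service's
    reply: (recognised text, confidence between 0 and 1) pairs.
    """
    if not candidates:
        return 0.0
    ref = reference.strip().lower()
    # Step 2: pick the candidate closest to the reference
    # (on a tie, the first such candidate in the list wins).
    best_text, best_conf = min(
        candidates, key=lambda c: levenshtein(ref, c[0].strip().lower())
    )
    text = best_text.strip().lower()
    distance = levenshtein(ref, text)
    # Step 3: scale the confidence by a placeholder similarity function.
    longest = max(len(ref), len(text), 1)
    similarity = 1.0 - distance / longest
    return 100.0 * best_conf * similarity
```

For example, score("bonjour", [("bonsoir", 0.95), ("bonjour", 0.9)]) 
picks the exact match "bonjour" and so reports roughly its confidence, 
about 90, whereas a reply containing only near misses would be scaled 
down according to how far the closest one is from the reference.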

It sounds like an interesting small app - whilst there appears to be 
some dubiety (in the responses on the list) about potential 
effectiveness, seeing if it can help can't hurt, can it?

Warmest Regards,

Mark.

-- 
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



