Extracting pitch information from sound files

Len Morgan len-morgan at crcom.net
Fri Mar 7 08:01:02 EST 2008


I do this all the time for pitch correction of enthusiastic but less 
talented singers (I own a recording studio).  I use software that is 
designed specifically for the task, because I need not only to get the 
pitch information but also to keep the formant information so that the 
character of the voice stays the same when I change the pitch.

If you are ONLY interested in pitch information, your best bet is 
probably some sort of FFT library that would take the wav file and 
return a set of "buckets" that tell you the amplitude (i.e., 
volume) of each frequency band in the sound over a sample period.  The 
pitch is USUALLY the frequency of the bucket with the highest amplitude 
in that sample set.
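
To make the "buckets" idea concrete, here is a minimal sketch in Python 
(not Rev/Transcript) using only the standard library.  Everything in it 
is an assumption for illustration: the function names, the 8000 Hz 
sample rate, the 1024-sample frame, and the small hand-written FFT that 
stands in for a real FFT library.  It analyses a generated 440 Hz test 
tone rather than reading a wav file, and it simply picks the loudest 
bucket, which is a rule of thumb and not a robust pitch detector.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def strongest_frequency(frame, sample_rate):
    """Return the centre frequency (Hz) of the loudest bucket in one frame."""
    n = len(frame)
    # A Hann window reduces leakage between neighbouring buckets.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spectrum = fft(windowed)
    # For real input only the first half is meaningful; skip bucket 0 (DC).
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak = magnitudes.index(max(magnitudes)) + 1
    # Each bucket is sample_rate / n Hz wide.
    return peak * sample_rate / n


# Hypothetical test signal: a pure 440 Hz sine, one frame long.
SAMPLE_RATE = 8000
FRAME_SIZE = 1024
tone = [
    math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(FRAME_SIZE)
]
estimate = strongest_frequency(tone, SAMPLE_RATE)
print(estimate)
```

With these numbers each bucket is 8000 / 1024 = 7.8125 Hz wide, so the 
estimate can only be expected to land within about one bucket of the true 
pitch.  A longer frame gives narrower buckets but a coarser picture of 
how the pitch changes over time.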

As others have suggested, this is a very math intensive task and while 
you COULD do it in Rev/Transcript, you'd be much better served with 
something a little faster and lower level.

It should be noted that if you are looking for something more than 
monophonic detection (i.e., only one pitch in the sound, like a single 
note on a guitar vs. a strum of all of the strings), this is not 
currently available (reliably, anyway) at any price.  Polyphonic sounds 
get to be very complex.  Human ears are much better at this than 
computers for the time being.

len morgan

David Glasgow wrote:
> The subject line pretty much says it all, but more specifically I want 
> to statistically analyse change in pitch, not play it, save it as 
> sound or relate it directly to any musical system.  So any kind of 
> rational number would be fine, and I would then chuck away the .wav 
> .aiff or whatever.
>
> 1/  How hard would it be to parse sound files recorded in Rev and 
> extract just the chunks of data relating to pitch ?
>
> 2/  Does it make any difference if the sound is complex (like an 
> animal call) or simple like a signal from a tone generator?
>
> 3/  Are any of the formats offered by Rev easier to handle in this 
> respect?
>
> 4/  Assuming standard bit rates, how much pitch data would be 
> generated by, say a ten second recording?
>
> 5/  I have settled for post hoc parsing rather than 'on the fly' 
> processing because I assumed the overhead would be too great for the 
> latter to work.  Is that right?
>
> 6/ Are there any other sensible questions I should be asking?
>
>
> Best Wishes,
>
> David Glasgow
> Carlton Glasgow Partnership
>
> http://www.i-psych.co.uk
>


