Extracting pitch information from sound files

Fri Mar 7 07:31:39 EST 2008

On 7 Mar 2008, at 09:37, David Glasgow wrote:
>
> 1/  How hard would it be to parse sound files recorded in Rev and  
> extract just the chunks of data relating to pitch ?

Digitally recorded sound has no chunks that relate specifically to  
pitch. An audio file is essentially a long list of numbers describing  
the changing amplitude of the sound's waveform. In theory that list  
encodes every property of the sound (pitch, timbre and timing), but  
pitch is not stored directly and has to be derived by analysing the  
samples.
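
To make the "long list of numbers" concrete, here is a rough Python sketch (not Rev) that reads the samples out of a 16-bit mono WAV file using only the standard library. The helper names read_mono_16bit_samples and make_test_tone are made up for this illustration, and the test tone is generated in memory only so the example runs without a real recording.

```python
import io
import math
import struct
import wave

def read_mono_16bit_samples(wav_file):
    """Return (sample_rate, samples) from a 16-bit mono WAV path or file object."""
    with wave.open(wav_file, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono audio")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    # 16-bit WAV samples are little-endian signed integers.
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return rate, samples

def make_test_tone(freq_hz=440.0, rate=8000, seconds=0.1):
    """Build an in-memory WAV holding a sine tone (demonstration data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        n = int(rate * seconds)
        frames = b"".join(
            struct.pack("<h", int(20000 * math.sin(2 * math.pi * freq_hz * i / rate)))
            for i in range(n)
        )
        w.writeframes(frames)
    buf.seek(0)
    return buf

rate, samples = read_mono_16bit_samples(make_test_tone())
print(rate, len(samples), samples[:5])
```

Every value in samples is one amplitude reading; nothing in the file says "this is the pitch".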

>
> 2/  Does it make any difference if the sound is complex (like an  
> animal call) or simple like a signal from a tone generator?

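Yes. The simpler the sound, the easier it will be. A signal from a  
tone generator has one clear frequency, whereas an animal call mixes  
many frequencies that change over time.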
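
As an illustration of why a simple tone is the easy case, the Python sketch below estimates frequency by counting upward zero crossings. This is only one naive technique, chosen here for brevity; the function name estimate_pitch_zero_crossings and the synthetic 440 Hz test tone are assumptions for the example.

```python
import math

def estimate_pitch_zero_crossings(samples, rate):
    """Estimate frequency in Hz by counting upward zero crossings per second."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev < 0 <= cur:  # waveform moved from negative to non-negative
            crossings += 1
    duration = len(samples) / rate  # length of the recording in seconds
    return crossings / duration

rate = 8000
# One second of a pure 440 Hz sine wave (synthetic test data).
tone = [math.sin(2 * math.pi * 440 * i / rate + 0.5) for i in range(rate)]
estimate = estimate_pitch_zero_crossings(tone, rate)
print(estimate)
```

For a pure tone the estimate lands close to the true frequency. A sound with strong overtones or noise can cross zero several times per cycle, so this count would overestimate and a more robust analysis would be needed.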
>
> 3/  Are any of the formats offered by Rev easier to handle in this  
> respect?

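The difference between uncompressed sound file formats (.wav  
and .aiff for instance) is mostly in the file headers. The audio  
data itself is much the same, although details such as byte order  
can differ (16-bit .wav samples are little-endian, .aiff samples are  
big-endian).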
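
One detail worth checking when reading the raw data yourself is byte order: 16-bit .wav samples are little-endian, while .aiff samples are big-endian. The Python sketch below decodes the same three sample values from both layouts; decode_16bit is a hypothetical helper, and the byte strings stand in for data already pulled out from behind the file header.

```python
import struct

def decode_16bit(raw, big_endian):
    """Decode raw 16-bit signed samples; assumes len(raw) is even."""
    fmt = (">" if big_endian else "<") + "%dh" % (len(raw) // 2)
    return list(struct.unpack(fmt, raw))

# The same three samples, laid out as each format would store them.
wav_bytes = struct.pack("<3h", 0, 1000, -1000)   # little-endian, as in .wav
aiff_bytes = struct.pack(">3h", 0, 1000, -1000)  # big-endian, as in .aiff

wav_samples = decode_16bit(wav_bytes, big_endian=False)
aiff_samples = decode_16bit(aiff_bytes, big_endian=True)
print(wav_samples, aiff_samples)
```

Once decoded with the right byte order, the two lists are identical, so any analysis code can be shared between formats.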
>
> 4/  Assuming standard bit rates, how much pitch data would be  
> generated by, say a ten second recording?

Typically, uncompressed sound files store a certain number of samples  
per second per channel. The CD standard is 44100 samples per second,  
with each sample being a two-byte signed integer (the sample size).  
So ten seconds from a stereo CD would be 10 * 44100 * 2 * 2 = 1764000  
bytes of raw audio data.
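
The arithmetic can be checked with a few lines of Python. The function name pcm_size_bytes is made up for this sketch, and the defaults are the CD figures quoted above (44100 samples per second, two bytes per sample, two channels).

```python
def pcm_size_bytes(seconds, sample_rate=44100, bytes_per_sample=2, channels=2):
    """Size in bytes of uncompressed audio data, excluding any file header."""
    return int(seconds * sample_rate * bytes_per_sample * channels)

stereo_size = pcm_size_bytes(10)            # ten seconds of stereo CD audio
mono_size = pcm_size_bytes(10, channels=1)  # the same recording in mono
print(stereo_size, mono_size)
```

Ten seconds of stereo works out to 1764000 bytes, and a mono recording is half that.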

>
> 5/  I have settled for post hoc parsing rather than 'on the fly'  
> processing because I assumed the overhead would be too great for  
> the latter to work.  Is that right?

Probably.
>
> 6/ Are there any other sensible questions I should be asking?
>

I'd probably start by looking for command-line tools that you could  
call from Rev. Searching for 'pitch extraction' or 'audio analysis'  
would be a good start.

Best,

Mark