Thursday, November 15, 2012

Robot Life, Vol 3

Since the last post I've been working on some audio processing tests. The goal is to have Nao play along with improvised music jamming! I'll detail some of the experiments and progress thus far.

The high-level flow is to listen to nearby audio, determine tempo and musical scale, and then synthesize some melodies to match. The first step is to analyse the microphone audio to extract useful information using the Fast Fourier Transform.

The FFT lets us determine which frequencies exist in a raw audio waveform. The result is a set of linearly spaced frequency bins which tell us how much of each frequency range is present in the waveform.

The frequencies the FFT can detect are inherently limited by the duration and sample rate of the waveform data being processed. For example, if we read 20ms of waveform data, the FFT sorts frequencies into (20 / 1000 * SampleRate) discrete bins. With a sample rate of 44100 samples per second (44.1khz), this gives us 882 bins of data. Each bin covers a frequency range of 50hz (44100 / 882). Due to the Nyquist limit, only the lower half of these bins is usable (for real-valued audio the upper half simply mirrors the lower half), giving us 441 bins of usable data covering 0hz to 22050hz. We can use these bins to determine which musical notes are present based on how much of their frequency we detected.
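To make the arithmetic concrete, here is a small Python sketch of the numbers above. It is only an illustration, not the code running on Nao, and the function name fft_bin_layout is made up for this example.

```python
def fft_bin_layout(duration_ms, sample_rate):
    """Return (total_bins, bin_width_hz, usable_bins) for one FFT window."""
    # One bin per input sample in the window.
    total_bins = int(round(duration_ms / 1000 * sample_rate))
    # Bins are linearly spaced across the full sample rate.
    bin_width_hz = sample_rate / total_bins
    # Only bins below the Nyquist limit (half the sample rate) are usable.
    usable_bins = total_bins // 2
    return total_bins, bin_width_hz, usable_bins


total, width, usable = fft_bin_layout(20, 44100)
print(total, width, usable)  # 882 50.0 441
```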

[Figure: frequency bins]

Humans perceive pitch on a logarithmic scale, which is not cleanly represented by the linearly spaced frequency bins from the FFT. Musical notes in low octaves are closer together in frequency than notes in higher octaves, which means we lose accuracy when estimating low notes.

For example, to detect C7 (2093hz), we examine the bin containing 2050-2100hz data. For the adjacent note C#7 (2217hz), we check the bin for 2200-2250hz. In our linearly spaced 50hz bins these are three bins apart, so each note lands cleanly in its own bin.

When we examine lower octaves, we can see the frequency difference between musical notes is smaller. For example, A2 and A#2 are 110hz and roughly 116.5hz respectively, but with a bin spacing of 50hz, both these notes will end up mostly in the same frequency bin (100-150hz) - we cannot distinguish them.
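The two cases can be checked with a few lines of Python. This is a sketch under some assumptions: notes are identified by MIDI number with A4 = 440hz in equal temperament, and bin k is treated as covering k * 50hz up to (k + 1) * 50hz, which is the simplified picture used in this post. The helper names are invented for the example.

```python
A4_HZ = 440.0  # assumed tuning reference (MIDI note 69)


def note_frequency(midi_note):
    """Equal-temperament frequency of a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - 69) / 12)


def bin_index(freq_hz, bin_width_hz=50.0):
    """Index of the bin whose range contains freq_hz (simplified view)."""
    return int(freq_hz // bin_width_hz)


# MIDI numbers for the notes discussed above.
NOTES = {"A2": 45, "A#2": 46, "C7": 96, "C#7": 97}

for name, midi in NOTES.items():
    freq = note_frequency(midi)
    print(name, round(freq, 1), "hz -> bin", bin_index(freq))
```

C7 and C#7 land in bins 41 and 44, while A2 and A#2 both land in bin 2.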

[Figure: spacing of notes across frequency bins]

With our precision of 50hz bins, it would seem that anything below A5 (880hz) or G#5 (830hz) is not accurately detectable, since those two adjacent notes are already only about 50hz apart. However, any given note also contributes a lesser amount to the bins next to it, which means we can still detect lower notes than this if we interpolate between several adjacent bins. For a 50hz bin spacing, adjacent notes can still be told apart down to a spacing of 15-20hz or so (around C4).
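One common way to interpolate between adjacent bins is to fit a parabola through the peak bin and its two neighbours. I am naming this technique as an example, not as the exact method used here. The sketch assumes bin k is centred on k * bin_width_hz (which is how DFT bins are strictly defined) and that k is not the first or last bin; interpolate_peak is a made-up name.

```python
def interpolate_peak(magnitudes, k, bin_width_hz):
    """Estimate the peak frequency near bin k from three adjacent bins."""
    a, b, c = magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
    denom = a - 2 * b + c
    if denom == 0:
        # Flat neighbourhood: nothing to refine, fall back to the bin centre.
        return k * bin_width_hz
    # Vertex of the parabola through the three points, in fractions of a bin.
    offset = 0.5 * (a - c) / denom
    return (k + offset) * bin_width_hz


# Hypothetical magnitudes with the true peak between bins 5 and 6.
mags = [0.0, 0.0, 0.0, 0.0, 8.31, 9.91, 9.51]
print(interpolate_peak(mags, 5, 50.0))
```

With these made-up values the estimate comes out at about 265hz, even though the bin centres are at 250hz and 300hz.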

We can increase the number of frequency bins (and thus reduce the spacing, giving increased resolution) by increasing the length of sample data. This will decrease responsiveness since we need to wait longer for data before processing, so a good balance of parameters is necessary.
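The trade-off is easy to put numbers on, because the bin spacing is simply one over the window duration. A small sketch (the function name and the example spacings are my own choices for illustration):

```python
def window_ms_for_spacing(bin_spacing_hz):
    """Window duration in milliseconds needed for a given bin spacing."""
    return 1000.0 / bin_spacing_hz


SAMPLE_RATE = 44100

# 50hz is the spacing used in this post; the others are illustrative targets.
for spacing_hz in (50.0, 15.0, 6.5):
    duration_ms = window_ms_for_spacing(spacing_hz)
    samples = round(SAMPLE_RATE * duration_ms / 1000)
    print(spacing_hz, "hz spacing needs", round(duration_ms, 1), "ms,", samples, "samples")
```

Separating A2 from A#2 (about 6.5hz apart) into their own bins would need a window of roughly 150ms, which is a noticeable delay for something that is supposed to play along in real time.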

There are a few details worth paying attention to when implementing your waveform processing. Apply a Hanning window to your sample data to reduce spectral leakage. Take care that your data stays within the value ranges you expect at every step. Work with input data in the range -1.0f to 1.0f, and use log() on your FFT results to display them in a more natural form for visualization/debugging. Estimate local maxima from curves fitted through several adjacent bins for better estimation of low-frequency notes.
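Here is a minimal Python sketch of the windowing and log steps. To keep it self-contained it uses a naive DFT from the standard library instead of a real FFT (same result, much slower), a low 8000hz sample rate, and a synthetic 1000hz test tone. All of those are assumptions for the example, and none of the names come from the actual project.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(samples):
    """Magnitudes of the usable (below-Nyquist) bins, via a naive DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(abs(acc))
    return mags


def log_spectrum(samples):
    """Window the samples, transform them, and return log magnitudes."""
    window = hann_window(len(samples))
    windowed = [s * w for s, w in zip(samples, window)]
    # Small epsilon avoids log(0) on empty bins.
    return [math.log(m + 1e-12) for m in dft_magnitudes(windowed)]


SAMPLE_RATE = 8000
N = 160  # 20ms of audio, so bins are 50hz apart

# Synthetic input in the -1.0 to 1.0 range: a clean 1000hz sine.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(N)]
spectrum = log_spectrum(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin * SAMPLE_RATE / N)  # 1000.0
```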

To determine the musical scale from this data, it is sufficient to track the detected notes and check which scale's set of notes matches best. Tracking notes over a much longer time period than your sample data lets you reach the correct musical scale within reasonable time. As always, parameters need tuning to balance responsiveness to scale changes and accuracy of scale detection.
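A best-match check along these lines can be sketched in a few lines of Python. This version only considers the twelve major scales, takes notes as MIDI numbers, and keeps a fixed-size history; those choices, the history length of 64, and the names are assumptions for the example rather than the actual implementation.

```python
from collections import Counter, deque

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets of a major scale


def best_major_scale(midi_notes):
    """Return the root name of the major scale covering the most notes."""
    counts = Counter(note % 12 for note in midi_notes)

    def score(root):
        # How many of the tracked notes belong to the scale on this root.
        return sum(counts[(root + step) % 12] for step in MAJOR_STEPS)

    return NOTE_NAMES[max(range(12), key=score)]


# Keep only the most recent detections; a longer history is more
# accurate but slower to react to a change of scale.
history = deque(maxlen=64)
history.extend([60, 62, 64, 65, 67, 69, 71, 72, 67, 64])
print(best_major_scale(history))  # C
```

Note that a major scale and its relative minor share the same notes, so telling those apart would need more than this simple count.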

So that's the general overview of how things are put together. The current musical note detection is fairly accurate for clean notes within a limited octave range. Human whistling and ocarina-style instruments give clean note detection. Guitars and pianos produce harmonics over multiple frequencies, which need to be accounted for.
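One simple way to account for harmonics is to drop any detected peak that sits close to an integer multiple of a lower peak. This is just one possible approach sketched in Python, not necessarily what ends up in the project; strip_harmonics and the 3% tolerance are invented for the example.

```python
def strip_harmonics(peaks_hz, tolerance=0.03):
    """Keep only peaks that are not near a multiple of a lower kept peak."""
    fundamentals = []
    for freq in sorted(peaks_hz):
        is_harmonic = False
        for base in fundamentals:
            ratio = freq / base
            nearest = round(ratio)
            # Harmonics sit at 2x, 3x, 4x... the fundamental frequency.
            if nearest >= 2 and abs(ratio - nearest) <= tolerance * nearest:
                is_harmonic = True
                break
        if not is_harmonic:
            fundamentals.append(freq)
    return fundamentals


print(strip_harmonics([220.0, 440.0, 660.0, 880.0]))  # [220.0]
print(strip_harmonics([440.0, 220.0, 330.0]))         # [220.0, 330.0]
```

The obvious limitation is that a note genuinely played one or more octaves above another gets discarded too, so this only suits single-note melodies.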

Next time I'll cover some of the trickery involved in generating synthesized audio data to play specific notes and melodies!

