Wednesday, April 22, 2009


Today was a much more interesting day in terms of technical content (at least, to me). In the morning I went to the Content-based Audio Processing session (at least the first half). All the presentations were good, but I liked "Interpolating Hidden Markov Model and its Application to Automatic Instrument Recognition" by Tuomas Virtanen and Toni Heittola the best. Essentially, the motivation is that hidden Markov models (HMMs) assume that an audio element decomposes into discrete states. In reality, the transition between states is smoother, and while one can add states to the model to reduce the error, doing so requires a lot more training data since there are more parameters to estimate. Their suggestion is to create auxiliary states by interpolating the HMM parameters. They demonstrated a 5% absolute improvement over the baseline (no interpolation). However, their test database was isolated instrument recognition, and I would have liked to see how their approach behaves in the presence of different noise types and levels. Using discrete states can help with noise reduction by vector quantizing the state space, so it's possible that, with interpolated states, the presence of noise could take the state trajectory down an erroneous path. For the afternoon, I went to the poster sessions and there's plenty I could say, but I'd miss the rest of the conference. Instead I took pictures of the posters and they are presented below.
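Going back to the interpolating-HMM talk for a moment, here is a minimal sketch of how I picture the auxiliary states. This is my own illustration, not the authors' method: it assumes each state has a single diagonal Gaussian and simply interpolates the means and variances linearly between neighboring states. The function name `interpolate_states` and the parameter `n_aux` are made up for this example.

```python
import numpy as np

def interpolate_states(means, variances, n_aux):
    """Insert n_aux auxiliary states between each pair of adjacent states.

    Assumption (mine, not from the paper): one diagonal Gaussian per state,
    with parameters interpolated linearly between neighbors.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Weights 0, 1/(n_aux+1), ..., n_aux/(n_aux+1); weight 1 is the next state itself.
    weights = np.linspace(0.0, 1.0, n_aux + 2)[:-1]
    out_means, out_vars = [], []
    for i in range(len(means) - 1):
        for w in weights:
            out_means.append((1 - w) * means[i] + w * means[i + 1])
            out_vars.append((1 - w) * variances[i] + w * variances[i + 1])
    # Append the final original state, which no loop iteration emits.
    out_means.append(means[-1])
    out_vars.append(variances[-1])
    return np.array(out_means), np.array(out_vars)

# Two trained states in a 2-D feature space, one auxiliary state in between.
m, v = interpolate_states([[0.0, 0.0], [2.0, 4.0]], [[1.0, 1.0], [3.0, 1.0]], n_aux=1)
```

The point of the sketch is that the auxiliary states add no new free parameters: they are fully determined by the original states, which is why the approach should need less training data than simply adding more states.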

This is a plug for Emiru Tsunoo's paper titled "Rhythm Map: Extraction of Unit Rhythmic Patterns and Analysis of Rhythmic Structure from Music Acoustic Signals." I've read the paper and it's very good work. I like the search for new and better features.

Emiru presenting his poster. He was crowded the entire time.

Dr. Sagayama explaining RhythmMap to Malcolm Slaney of Yahoo! and Stanford.
