Spectral similarity metrics for sound source formation based on the common variation cue
2010 (English) In: MULTIMEDIA TOOLS AND APPLICATIONS, ISSN 1380-7501, Vol. 48, no 1, 185-205 p. Article in journal (Refereed) Published
Scene analysis is a relevant way of gathering information about the structure of an audio stream. For content extraction purposes, it also provides prior knowledge that can be taken into account in order to provide more robust results for standard classification approaches. In order to perform such scene analysis, we believe that the notion of temporality is important. Consequently, we study in this paper a new way of modeling the evolution over time of the frequency and amplitude parameters of spectral components. We evaluate its benefits by considering its ability to automatically gather the components of the same sound source. The evaluation of the proposed metric shows that it achieves good performance and takes better account of micro-modulations.
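As a rough illustration of the common variation cue mentioned in the abstract, the sketch below scores two spectral components by the Pearson correlation of their frequency trajectories. This is not the metric proposed in the paper: the correlation measure, the function names (`pearson`, `common_variation_similarity`), and the synthetic vibrato tracks are all assumptions chosen for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (no variation to compare).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def common_variation_similarity(freq_a, freq_b):
    """Hypothetical similarity: correlation of two frequency trajectories.

    Pearson correlation ignores offset and scale, so harmonics at different
    mean frequencies can still score close to 1 if they vary together.
    """
    return pearson(freq_a, freq_b)


# Synthetic example (assumed values): 100 frames covering one second.
times = [k / 100.0 for k in range(100)]

# Two partials of one source sharing a 5 Hz vibrato (micro-modulation).
partial_1 = [440.0 * (1 + 0.01 * math.sin(2 * math.pi * 5 * t)) for t in times]
partial_2 = [880.0 * (1 + 0.01 * math.sin(2 * math.pi * 5 * t)) for t in times]

# A partial of another source with an unrelated 3 Hz vibrato.
partial_3 = [660.0 * (1 + 0.01 * math.sin(2 * math.pi * 3 * t)) for t in times]

same_source = common_variation_similarity(partial_1, partial_2)
other_source = common_variation_similarity(partial_1, partial_3)
```

Under these assumptions, `same_source` is close to 1 while `other_source` is close to 0, so a clustering step could group the first two partials together. A metric intended for real signals would also need to handle amplitude trajectories and components of differing lengths.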
Place, publisher, year, edition, pages
Springer Science+Business Media, 2010. Vol. 48, no 1, 185-205 p.
Auditory scene analysis, Mid-level representation, Clustering, Common variation cue
Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-54854, DOI: 10.1007/s11042-009-0382-9, ISI: 000276079400011, OAI: oai:DiVA.org:liu-54854, DiVA: diva2:310860