We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:


Current browse context:


Change to browse by:

References & Citations


(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Electrical Engineering and Systems Science > Audio and Speech Processing

Title: Iterative Sound Source Localization for Unknown Number of Sources

Abstract: Sound source localization aims to seek the direction of arrival (DOA) of all sound sources from the observed multi-channel audio. For the practical problem of unknown number of sources, existing localization algorithms attempt to predict a likelihood-based coding (i.e., spatial spectrum) and employ a pre-determined threshold to detect the source number and corresponding DOA value. However, these threshold-based algorithms are not stable since they are limited by the careful choice of threshold. To address this problem, we propose an iterative sound source localization approach called ISSL, which can iteratively extract each source's DOA without threshold until the termination criterion is met. Unlike threshold-based algorithms, ISSL designs an active source detector network based on binary classifier to accept residual spatial spectrum and decide whether to stop the iteration. By doing so, our ISSL can deal with an arbitrary number of sources, even more than the number of sources seen during the training stage. The experimental results show that our ISSL achieves significant performance improvements in both DOA estimation and source number detection compared with the existing threshold-based algorithms.
Comments: Accepted by Interspeech 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
Cite as: arXiv:2206.12273 [eess.AS]
  (or arXiv:2206.12273v1 [eess.AS] for this version)

Submission history

From: Yanjie Fu [view email]
[v1] Fri, 24 Jun 2022 13:19:44 GMT (570kb,D)

Link back to: arXiv, form interface, contact.