Title: An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection

Abstract: Polyphonic sound event localization and detection (SELD) jointly performs sound event detection (SED) and direction-of-arrival (DoA) estimation: it detects the type and occurrence time of sound events together with their corresponding DoA angles. We study the SELD task from a multi-task learning perspective and address two open problems. First, to detect overlapping sound events of the same type but with different DoAs, we propose a trackwise output format and solve the accompanying track permutation problem with permutation-invariant training; multi-head self-attention is further used to separate tracks. Second, a previous finding is that SELD with hard parameter-sharing suffers a performance loss compared with learning the subtasks separately; we solve this with a soft parameter-sharing scheme. We term the proposed method Event-Independent Network V2 (EINV2), an improved version of our previously proposed method and an end-to-end network for SELD. We show that EINV2 for joint SED and DoA estimation outperforms previous methods by a large margin and has comparable performance to state-of-the-art ensemble models.
Comments: 5 pages, 2021 IEEE International Conference on Acoustics, Speech and Signal Processing
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2010.13092 [cs.SD]
  (or arXiv:2010.13092v4 [cs.SD] for this version)
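
The permutation-invariant training mentioned in the abstract can be illustrated with a minimal sketch. This is not the EINV2 implementation: the function names `mse` and `pit_loss` are hypothetical, plain mean squared error stands in for the paper's SED and DoA losses, and each track is assumed to be a flat list of numbers. The idea shown is only that the loss is evaluated under every assignment of predicted tracks to target tracks and the smallest value is kept, so the order in which the network emits tracks does not matter.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(pred_tracks, target_tracks):
    """Return (lowest loss, best permutation) over all track assignments.

    pred_tracks and target_tracks are lists of tracks; perm[i] is the index
    of the target track matched to predicted track i.
    Assumption: MSE is a placeholder for the task-specific losses.
    """
    n = len(pred_tracks)
    best_loss, best_perm = None, None
    for perm in permutations(range(n)):
        # Average the per-track loss under this candidate assignment.
        loss = sum(mse(pred_tracks[i], target_tracks[j])
                   for i, j in enumerate(perm)) / n
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Enumerating all permutations costs n! loss evaluations, which is only practical for a small number of tracks; this sketch makes no claim about how many tracks the paper uses.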

Submission history

From: Yin Cao
[v1] Sun, 25 Oct 2020 11:27:22 GMT (174kb,D)
[v2] Wed, 28 Oct 2020 16:41:06 GMT (174kb,D)
[v3] Sun, 7 Feb 2021 07:56:15 GMT (174kb,D)
[v4] Thu, 11 Feb 2021 04:44:23 GMT (178kb,D)