Computer Science > Sound
Title: An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection
(Submitted on 25 Oct 2020 (v1), last revised 11 Feb 2021 (this version, v4))
Abstract: Polyphonic sound event localization and detection (SELD), which jointly performs sound event detection (SED) and direction-of-arrival (DoA) estimation, detects the type and occurrence time of sound events as well as their corresponding DoA angles simultaneously. We study the SELD task from a multi-task learning perspective. Two open problems are addressed in this paper. Firstly, to detect overlapping sound events of the same type but with different DoAs, we propose to use a trackwise output format and solve the accompanying track permutation problem with permutation-invariant training. Multi-head self-attention is further used to separate tracks. Secondly, a previous finding is that, with hard parameter-sharing, SELD suffers a performance loss compared with learning the subtasks separately. This is solved by a soft parameter-sharing scheme. We term the proposed method Event Independent Network V2 (EINV2), which is an improved version of our previously proposed method and an end-to-end network for SELD. We show that our proposed EINV2 for joint SED and DoA estimation outperforms previous methods by a large margin, and has comparable performance to state-of-the-art ensemble models.
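The track permutation problem mentioned in the abstract can be illustrated with a minimal sketch of permutation-invariant training. This is not the authors' implementation: the function name `pit_loss`, the array layout `(tracks, frames, features)`, the use of mean squared error, and the choice of one permutation per clip are all assumptions made for illustration.

```python
from itertools import permutations

import numpy as np


def pit_loss(pred, target):
    """Hypothetical helper: permutation-invariant loss over output tracks.

    pred, target: arrays of shape (tracks, frames, features).
    Returns (lowest loss, permutation of target tracks that achieves it).
    """
    n_tracks = pred.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in permutations(range(n_tracks)):
        # Reorder the target tracks by this permutation and score the match.
        # Mean squared error is an assumed stand-in for the real loss.
        loss = float(np.mean((pred - target[list(perm)]) ** 2))
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Because the loss is taken as the minimum over all assignments of targets to tracks, the network is not penalized for emitting the right events on a different track than the labels happened to use. The search is exhaustive, so it is only practical for a small number of tracks.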
Submission history
From: Yin Cao
[v1] Sun, 25 Oct 2020 11:27:22 GMT (174kb,D)
[v2] Wed, 28 Oct 2020 16:41:06 GMT (174kb,D)
[v3] Sun, 7 Feb 2021 07:56:15 GMT (174kb,D)
[v4] Thu, 11 Feb 2021 04:44:23 GMT (178kb,D)