Sound Event Detection Using Duration Robust Loss Function

Akiyama, Daichi; Imoto, Keisuke; Tonami, Noriyuki; Okamoto, Yuki; Yamanishi, Ryosuke; Fukumori, Takahiro; Yamashita, Yoichi



(Submitted on 27 Jun 2020)

Abstract: Many machine-learning-based methods of sound event detection (SED) treat each segmented time frame as one data sample for model training. However, the durations of sound events vary greatly depending on the sound event class; e.g., the sound event "fan" has a long duration, while the sound event "mouse clicking" is instantaneous. This difference in duration between sound event classes causes a serious data imbalance problem in SED. In this paper, we propose a method for SED using a duration robust loss function, which can focus model training on sound events of short duration. The proposed method exploits the relationship between the duration of a sound event and the ease or difficulty of model training. In particular, many sound events of long duration (e.g., "fan") are stationary sounds, which have little variation in their acoustic features and are therefore easy to train on. Meanwhile, some sound events of short duration (e.g., "object impact") have more than one audio pattern, such as attack, decay, and release parts, which makes their model training more difficult. We thus apply class-wise reweighting to the binary cross-entropy loss function depending on the ease or difficulty of model training. Evaluation experiments conducted using the TUT Sound Events 2016/2017 and TUT Acoustic Scenes 2016 datasets show that the proposed method improves sound event detection performance by 3.15 and 4.37 percentage points in macro- and micro-F-scores, respectively, compared with a conventional method using the binary cross-entropy loss function.
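
The abstract does not give the exact form of the reweighting, so the following is only a minimal sketch of what a class-wise reweighted binary cross-entropy could look like. It assumes, purely for illustration, that each class weight is the inverse of the fraction of frames in which that class is active, normalized to a mean of 1. The function names `class_weights_from_durations` and `weighted_bce` are hypothetical and are not taken from the paper.

```python
import numpy as np


def class_weights_from_durations(labels):
    """Illustrative class weights from frame-level activity (an assumption,
    not the paper's formula).

    labels: binary array of shape (frames, classes), 1 where the event is active.
    Returns one weight per class, normalized so the weights average to 1.
    """
    labels = np.asarray(labels, dtype=float)
    # Fraction of frames in which each class is active (a proxy for duration).
    active_ratio = labels.mean(axis=0)
    # Guard against classes that never occur: treat them as active in one frame.
    active_ratio = np.maximum(active_ratio, 1.0 / labels.shape[0])
    inverse = 1.0 / active_ratio
    return inverse / inverse.mean()


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Binary cross-entropy averaged over frames and classes, with each
    class column scaled by its weight."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    per_frame = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    # weights has shape (classes,) and broadcasts across the frame axis.
    return float((per_frame * weights).mean())


if __name__ == "__main__":
    # Toy clip of 100 frames: column 0 is a long event, column 1 a short one.
    labels = np.zeros((100, 2))
    labels[:80, 0] = 1.0   # long-duration class, e.g. "fan"
    labels[:5, 1] = 1.0    # short-duration class, e.g. "mouse clicking"
    weights = class_weights_from_durations(labels)
    probs = np.full((100, 2), 0.5)
    print("class weights:", weights)
    print("weighted BCE:", weighted_bce(probs, labels, weights))
```

In this toy setup the short-duration class receives the larger weight, so its frames contribute more to the loss than those of the long-duration class. Setting all weights to 1 recovers the conventional binary cross-entropy used as the baseline.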

Comments:	Submitted to DCASE2020 Workshop
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2006.15253 [cs.SD]
	(or arXiv:2006.15253v1 [cs.SD] for this version)

Submission history

From: Keisuke Imoto
[v1] Sat, 27 Jun 2020 01:49:25 GMT (4095kb)
