Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement

Chen, Yu-Wen; Hirschberg, Julia; Tsao, Yu

(Submitted on 3 Sep 2023)

Abstract: Speech emotion recognition (SER) often suffers reduced performance under background noise. In addition, making a prediction on signals that contain only background noise could undermine user trust in the system. In this study, we propose a Noise Robust Speech Emotion Recognition system, NRSER. NRSER employs speech enhancement (SE) to reduce the noise in input signals. A signal-to-noise-ratio (SNR)-level detection structure and a waveform reconstitution strategy are then introduced to reduce the negative impact of SE on speech signals with little or no background noise. Our experimental results show that NRSER can effectively improve the noise robustness of the SER system, including preventing the system from making emotion predictions on signals consisting solely of background noise. Moreover, the proposed SNR-level detection structure can be used on its own for tasks such as data selection.
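
For readers unfamiliar with the terms, the sketch below shows the standard SNR definition in decibels and one hypothetical way an estimated SNR could control how much of the enhanced waveform is used. It is an illustration only and not the method of NRSER: the function names (snr_db, blend_weight, reconstitute), the linear blending rule, and the 0 dB and 20 dB thresholds are all assumptions made here, since the abstract does not specify how the detection structure or the reconstitution strategy work.

```python
import math


def snr_db(speech, noise):
    """Standard SNR in dB: 10 * log10(mean speech power / mean noise power)."""
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_speech / p_noise)


def blend_weight(est_snr_db, low_db=0.0, high_db=20.0):
    """Hypothetical rule: weight on the original input grows linearly
    from 0 at low_db to 1 at high_db (both thresholds are assumed)."""
    w = (est_snr_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, w))


def reconstitute(noisy, enhanced, est_snr_db):
    """Hypothetical blend of the original and enhanced waveforms.

    At high estimated SNR the original input dominates, so clean speech
    is not altered by enhancement; at low SNR the enhanced signal dominates.
    """
    w = blend_weight(est_snr_db)
    return [w * n + (1.0 - w) * e for n, e in zip(noisy, enhanced)]
```

Under these assumptions, an input estimated at 20 dB or above would pass through unchanged, while an input at 0 dB or below would be replaced entirely by its enhanced version.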

Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2309.01164 [eess.AS]
	(or arXiv:2309.01164v1 [eess.AS] for this version)

Submission history

From: Yu-Wen Chen
[v1] Sun, 3 Sep 2023 13:00:04 GMT (690kb,D)
