Multiple Confidence Gates For Joint Training Of SE And ASR

Wang, Tianrui; Zhu, Weibin; Gao, Yingying; Feng, Junlan; Zhang, Shilei

(Submitted on 1 Apr 2022)

Abstract: Joint training of a speech enhancement (SE) model and an automatic speech recognition (ASR) model is a common solution for robust ASR in noisy environments. SE focuses on improving the auditory quality of speech, but in doing so it changes the feature distribution in ways that are uncertain and detrimental to ASR. To tackle this challenge, an approach with multiple confidence gates for joint training of SE and ASR is proposed. A speech confidence gate prediction module is designed to replace the former SE module in joint training. The noisy speech is filtered by the gates to obtain features that are easier for the ASR network to fit. Experimental results show that the proposed method performs better than the traditional robust speech recognition system on test sets of clean speech, synthesized noisy speech, and real noisy speech.
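
The abstract does not specify how the gates are computed or combined, so the following is only a minimal sketch of the general idea of filtering noisy features with several confidence gates. It assumes each gate is a sigmoid of a predicted logit per time-frequency bin and that the gates are averaged before being multiplied into the noisy features; the function name `apply_confidence_gates`, the averaging rule, and the list-of-lists layout are all hypothetical and not taken from the paper.

```python
import math
from typing import List

Frames = List[List[float]]  # [time][feature], hypothetical layout


def sigmoid(x: float) -> float:
    """Map a logit to a confidence value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_confidence_gates(noisy: Frames, gate_logits: List[Frames]) -> Frames:
    """Filter noisy features with the mean of several sigmoid gates.

    ASSUMPTION: averaging the gates is an illustrative choice only;
    the paper may combine its gates differently.
    """
    if not gate_logits:
        raise ValueError("at least one gate is required")
    num_gates = len(gate_logits)
    filtered: Frames = []
    for t, frame in enumerate(noisy):
        out_frame = []
        for f, value in enumerate(frame):
            # Average confidence for this time-frequency bin across all gates.
            mean_gate = sum(sigmoid(g[t][f]) for g in gate_logits) / num_gates
            out_frame.append(value * mean_gate)
        filtered.append(out_frame)
    return filtered


# Two gates with zero logits give a confidence of 0.5 for every bin.
features = apply_confidence_gates(
    [[2.0, 4.0]],
    [[[0.0, 0.0]], [[0.0, 0.0]]],
)
```

In a joint-training setup, the gate logits would come from a trainable prediction module and the filtered features would be passed to the ASR network, so that the recognition loss shapes the gates directly.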

Comments:	5 pages
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2204.00226 [eess.AS]
	(or arXiv:2204.00226v1 [eess.AS] for this version)

Submission history

From: Tianrui Wang
[v1] Fri, 1 Apr 2022 06:19:24 GMT (1430kb,D)
