Improving Accented Speech Recognition with Multi-Domain Training

Maison, Lucas; Estève, Yannick

(Submitted on 14 Mar 2023)

Abstract: Thanks to the rise of self-supervised learning, automatic speech recognition (ASR) systems now achieve near-human performance on a wide variety of datasets. However, they still lack generalization capability and are not robust to domain shifts like accent variations. In this work, we use speech audio representing four different French accents to create fine-tuning datasets that improve the robustness of pre-trained ASR models. By incorporating various accents in the training set, we obtain both in-domain and out-of-domain improvements. Our numerical experiments show that we can reduce error rates by up to 25% (relative) on African and Belgian accents compared to single-domain training while keeping a good performance on standard French.
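The "25% (relative)" figure in the abstract is a relative reduction, not an absolute drop in percentage points. A minimal sketch of that arithmetic follows; the function name and the example error rates are hypothetical illustrations and are not results from the paper.

```python
def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Return the error-rate reduction as a fraction of the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates (in percent), chosen only to illustrate the formula.
single_domain_error = 20.0
multi_domain_error = 15.0

# An absolute drop of 5 points from a 20% baseline is a 25% relative reduction.
print(relative_reduction(single_domain_error, multi_domain_error))  # 0.25
```

Under this reading, the same 5-point absolute drop would be a smaller relative reduction against a higher baseline, which is why the two kinds of figure should not be compared directly.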

Comments:	5 pages, 2 figures. Accepted to ICASSP 2023
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2303.07924 [cs.LG]
	(or arXiv:2303.07924v1 [cs.LG] for this version)

Submission history

From: Lucas Maison
[v1] Tue, 14 Mar 2023 14:10:16 GMT (92kb,D)
