Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition

Fan, Zhiyun; Dong, Linhao; Shen, Chen; Liang, Zhenlin; Zhang, Jun; Lu, Lu; Ma, Zejun

(Submitted on 8 Jun 2023)

Abstract: Code-switching speech recognition (CSSR) transcribes speech that switches between multiple languages or dialects within a single sentence. The main challenge in this task is that different languages often have similar pronunciations, making it difficult for models to distinguish between them. In this paper, we propose a method for solving the CSSR task from the perspective of language-specific acoustic boundary learning. We introduce language-specific weight estimators (LSWE) to model acoustic boundary learning in different languages separately. Additionally, a non-autoregressive (NAR) decoder and a language change detection (LCD) module are employed to assist in training. Evaluated on the SEAME corpus, our method achieves state-of-the-art mixed error rates (MER) of 16.29% and 22.81% on the test_man and test_sge sets, respectively. We also demonstrate the effectiveness of our method on a 9000-hour in-house meeting code-switching dataset, where it achieves a 7.9% relative MER reduction.
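
The abstract reports results as mixed error rate (MER) and as a relative MER reduction. The sketch below illustrates how such numbers are commonly computed for Mandarin-English speech, assuming Mandarin is scored per character and English per word; the paper's exact scoring rules are not given here, so the tokenization, the function names (`mixed_tokens`, `mixed_error_rate`, `relative_reduction`), and the sample sentences are all illustrative assumptions, not the authors' evaluation script.

```python
import re


def mixed_tokens(text):
    """Split text into Mandarin characters and English words (assumed scoring units)."""
    # Each CJK ideograph is one token; each run of Latin letters/digits is one token.
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z0-9']+", text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def mixed_error_rate(refs, hyps):
    """Total edit errors divided by total reference tokens over a test set."""
    errors = total = 0
    for ref, hyp in zip(refs, hyps):
        r, h = mixed_tokens(ref), mixed_tokens(hyp)
        errors += edit_distance(r, h)
        total += len(r)
    return errors / total


def relative_reduction(baseline, new):
    """Relative error reduction: the fraction of the baseline error that was removed."""
    return (baseline - new) / baseline


# Hypothetical sentences, not taken from SEAME or the in-house dataset.
refs = ["我想去 shopping mall"]
hyps = ["我想去 shopping more"]
print(mixed_error_rate(refs, hyps))    # one substitution over five reference tokens
print(relative_reduction(20.0, 18.42)) # made-up baseline and system MERs
```

Under this convention a relative reduction is measured against the baseline MER, so it is not the same as the absolute difference in percentage points.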

Subjects:	Sound (cs.SD)
Cite as:	arXiv:2306.05279 [cs.SD]
	(or arXiv:2306.05279v1 [cs.SD] for this version)

Submission history

From: Zhiyun Fan
[v1] Thu, 8 Jun 2023 15:27:40 GMT (626kb,D)
