Brazilian Portuguese Speech Recognition Using Wav2vec 2.0

Gris, Lucas Rafael Stefanel; Casanova, Edresson; de Oliveira, Frederico Santos; Soares, Anderson da Silva; Junior, Arnaldo Candido

(Submitted on 23 Jul 2021 (v1), last revised 22 Dec 2021 (this version, v3))

Abstract: Deep learning techniques have been shown to be effective in various tasks, especially in the development of speech recognition systems, that is, systems that aim to transcribe a spoken sentence into a sequence of written words. Despite progress in the area, speech recognition remains difficult, especially for languages with little available data, such as Brazilian Portuguese (BP). This work presents the development of a public Automatic Speech Recognition (ASR) system built using only openly available audio data, by fine-tuning the Wav2vec 2.0 XLSR-53 model, which is pre-trained on many languages, on BP data. The final model achieves an average word error rate of 12.4% over 7 different datasets (10.5% when applying a language model). To our knowledge, this is the lowest error among open end-to-end (E2E) ASR models for BP.
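
For readers unfamiliar with the metric quoted in the abstract, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names `word_error_rate` and `mean_wer` are hypothetical, and averaging per-dataset scores with equal weight is an assumption, since the abstract does not state how the average over the 7 datasets was computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


def mean_wer(per_dataset_wer):
    """Unweighted mean of per-dataset WER values (an assumed averaging scheme)."""
    return sum(per_dataset_wer) / len(per_dataset_wer)
```

For example, `word_error_rate("o gato preto dorme", "o gato dorme")` returns 0.25, because one of the four reference words is deleted.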

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2107.11414 [cs.CL]
	(or arXiv:2107.11414v3 [cs.CL] for this version)

Submission history

From: Lucas Gris
[v1] Fri, 23 Jul 2021 18:54:39 GMT (609kb,D)
[v2] Sun, 28 Nov 2021 18:09:38 GMT (610kb,D)
[v3] Wed, 22 Dec 2021 16:29:54 GMT (613kb,D)
