
Title: Loss Prediction: End-to-End Active Learning Approach For Speech Recognition

Abstract: End-to-end speech recognition systems usually require large amounts of labeled data, yet annotating speech is complicated and expensive. Active learning addresses this by selecting the most valuable samples for annotation. In this paper, we propose using a predicted loss to estimate the uncertainty of a sample. The CTC (Connectionist Temporal Classification) and attention losses are informative for speech recognition because they are computed over all decoding paths and alignments. We define an end-to-end active learning pipeline that trains a joint ASR/LP (Automatic Speech Recognition/Loss Prediction) model. The proposed approach is validated on an English and a Chinese speech recognition task. The experiments show that our approach achieves competitive results, outperforming random selection, least confidence, and estimated loss methods.
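
The selection step the abstract describes can be illustrated with a minimal sketch: given a predicted loss for each unlabeled utterance, pick the utterances with the highest predicted loss for annotation. This is an assumption-laden illustration, not the authors' implementation. The function name `select_for_annotation`, the utterance IDs, the scores, and the top-k selection rule are all hypothetical, and the sketch assumes the scores have already been produced by some loss-prediction model.

```python
from typing import Dict, List


def select_for_annotation(predicted_loss: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` utterance IDs with the highest predicted loss.

    Hypothetical helper: a higher predicted loss is treated as higher
    uncertainty, so those utterances are sent for annotation first.
    Ties are broken by utterance ID so the result is deterministic.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Sort by descending predicted loss, then ascending ID for stable ties.
    ranked = sorted(predicted_loss.items(), key=lambda kv: (-kv[1], kv[0]))
    return [utt_id for utt_id, _ in ranked[:budget]]


# Made-up scores standing in for the output of a loss-prediction model.
scores = {"utt_a": 0.42, "utt_b": 1.87, "utt_c": 0.05, "utt_d": 1.10}
selected = select_for_annotation(scores, budget=2)
print(selected)
```

In an active learning loop, the selected utterances would be transcribed, added to the labeled set, and the model retrained before the next round of selection.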
Comments: Accepted to IJCNN 2021
Subjects: Audio and Speech Processing (eess.AS)
Cite as: arXiv:2107.04289 [eess.AS]
  (or arXiv:2107.04289v1 [eess.AS] for this version)

Submission history

From: Jian Luo
[v1] Fri, 9 Jul 2021 08:03:51 GMT (4845kb,D)
