
Title: Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition

Abstract: In this work, we explore a Connectionist Temporal Classification (CTC) based end-to-end Automatic Speech Recognition (ASR) model for the Myanmar language. A series of experiments is presented on the topology of the model, in which convolutional layers are added and dropped, different depths of bidirectional long short-term memory (BLSTM) layers are used, and different label encoding methods are investigated. The experiments are carried out in low-resource scenarios using our recorded Myanmar speech corpus of nearly 26 hours. The best model achieves a character error rate (CER) of 4.72% and a syllable error rate (SER) of 12.38% on the test set.
Comments: This is a preprint of the chapter: Chit K.M.M., Lin L.L., Exploring CTC Based End-To-End Techniques for Myanmar Speech Recognition, published in Advances in Intelligent Systems and Computing, vol 1324, edited by Vasant P., Zelinka I., Weber GW., 2021, Springer, Cham reproduced with permission of Springer. The final authenticated version is available at this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Journal reference: Advances in Intelligent Systems and Computing, vol 1324. Springer, Cham (2021)
DOI: 10.1007/978-3-030-68154-8_87
Cite as: arXiv:2105.06253 [cs.LG]
  (or arXiv:2105.06253v2 [cs.LG] for this version)

Submission history

From: Khin Chit
[v1] Thu, 13 May 2021 12:58:51 GMT (479kb)
[v2] Fri, 14 May 2021 05:29:56 GMT (235kb)
