Computer Science > Machine Learning
Title: Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition
(Submitted on 13 May 2021 (v1), last revised 14 May 2021 (this version, v2))
Abstract: In this work, we explore a Connectionist Temporal Classification (CTC) based end-to-end Automatic Speech Recognition (ASR) model for the Myanmar language. We present a series of experiments on the model topology in which convolutional layers are added and dropped, different depths of bidirectional long short-term memory (BLSTM) layers are used, and different label encoding methods are investigated. The experiments are carried out in low-resource scenarios using our recorded Myanmar speech corpus of nearly 26 hours. The best model achieves a character error rate (CER) of 4.72% and a syllable error rate (SER) of 12.38% on the test set.
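The CER and SER figures quoted in the abstract are edit-distance-based error rates. As a minimal sketch of how such a metric is commonly computed (not the authors' evaluation script), the Python below divides the total Levenshtein distance by the total reference length. The function names `edit_distance` and `error_rate` are hypothetical. The sketch also assumes that syllable segmentation has already been done elsewhere, and that a "character" is a Unicode code point, which may differ from the character unit used in the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions and substitutions each cost 1.
    Works on strings (character level) or lists of tokens (syllable level).
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Corpus-level error rate: total edits divided by total reference length.

    Assumption: refs and hyps are parallel lists. Pass strings for CER,
    or lists of pre-segmented syllables for SER.
    """
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_len = sum(len(r) for r in refs)
    if total_len == 0:
        raise ValueError("reference is empty")
    return total_edits / total_len
```

For CER, each reference and hypothesis is passed as a string; for SER, each is passed as a list of syllable tokens, so the same routine serves both metrics and only the unit of comparison changes.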
Submission history
From: Khin Chit
[v1] Thu, 13 May 2021 12:58:51 GMT (479kb)
[v2] Fri, 14 May 2021 05:29:56 GMT (235kb)