
Title: A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

Abstract: In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-length audio, English source transcript sentence, Vietnamese target subtitle sentence). We also conduct empirical experiments using strong baselines and find that the traditional "Cascaded" approach still outperforms the modern "End-to-End" approach. To the best of our knowledge, this is the first large-scale English-Vietnamese speech translation study. We hope both our publicly available dataset and study can serve as a starting point for future research and applications on English-Vietnamese speech translation. Our dataset is available at this https URL
Comments: In Proceedings of INTERSPEECH 2022, to appear. The first three authors contributed equally to this work
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2208.04243 [cs.CL]
  (or arXiv:2208.04243v1 [cs.CL] for this version)

Submission history

From: Dat Quoc Nguyen
[v1] Mon, 8 Aug 2022 16:11:26 GMT (38kb,D)
