Title: Neural machine translation for low-resource languages

Abstract: Neural machine translation (NMT) approaches have improved the state of the art in many machine translation settings over the last couple of years, but they require large amounts of training data to produce sensible output. We demonstrate that NMT can be used for low-resource languages as well, by introducing more local dependencies and using word alignments to learn sentence reordering during translation. In addition to our novel model, we also present an empirical evaluation of low-resource phrase-based statistical machine translation (SMT) and NMT to investigate the lower limits of the respective technologies. We find that while SMT remains the best option for low-resource settings, our method can produce acceptable translations with only 70000 tokens of training data, a level where the baseline NMT system fails completely.
Comments: rejected from EMNLP 2017
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1708.05729 [cs.CL]
  (or arXiv:1708.05729v1 [cs.CL] for this version)

Submission history

From: Jörg Tiedemann
[v1] Fri, 18 Aug 2017 18:16:23 GMT (16kb)
