Title: Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings

Abstract: Simultaneous translation differs sharply from full-sentence translation in that it begins translating before the source sentence ends, with a delay of only a few words. However, because large-scale, high-quality simultaneous translation datasets are lacking, most such systems are still trained on conventional full-sentence bitexts. This is far from ideal for the simultaneous scenario, because those bitexts contain many unnecessary long-distance reorderings. We propose a novel method that rewrites the target side of existing full-sentence corpora into simultaneous-style translations. Experiments on Zh->En and Ja->En simultaneous translation show substantial improvements (up to +2.7 BLEU) with the addition of these generated pseudo-references.
Comments: 8 pages
Subjects: Computation and Language (cs.CL)
Journal reference: EMNLP 2021
Cite as: arXiv:2010.11247 [cs.CL]
  (or arXiv:2010.11247v2 [cs.CL] for this version)

Submission history

From: Renjie Zheng
[v1] Wed, 21 Oct 2020 19:03:06 GMT (383kb,D)
[v2] Thu, 23 Sep 2021 17:33:35 GMT (386kb,D)
