Title: On the Information Redundancy in Non-Autoregressive Translation

Abstract: Token repetition is a typical form of the multi-modality problem in fully non-autoregressive translation (NAT). In this work, we revisit the multi-modality problem in recently proposed NAT models. Our study reveals that these advanced models introduce other types of information redundancy errors that cannot be measured by the conventional metric, the continuous repetition ratio. By manually annotating NAT outputs, we identify two types of information redundancy errors that correspond well to the lexical and reordering multi-modality problems. Since human annotation is time-consuming and labor-intensive, we propose automatic metrics to evaluate the two types of redundancy errors. Our metrics allow future studies to evaluate new methods and gain a more comprehensive understanding of their effectiveness.
Comments: 10 pages, 10 tables
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2405.02673 [cs.CL]
  (or arXiv:2405.02673v1 [cs.CL] for this version)
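
For readers unfamiliar with the conventional metric named in the abstract, the following is a minimal sketch of a continuous repetition ratio. It assumes the metric is the fraction of output tokens that are identical to the token immediately before them; the abstract does not give the exact normalization, so the function name `continuous_repetition_ratio` and this definition are illustrative assumptions, not the authors' implementation.

```python
from typing import Iterable, List


def continuous_repetition_ratio(outputs: Iterable[List[str]]) -> float:
    """Fraction of tokens equal to the token immediately preceding them.

    Assumption: normalized by the total number of tokens over all outputs.
    The paper may normalize differently.
    """
    repeated = 0
    total = 0
    for tokens in outputs:
        total += len(tokens)
        # Count positions where a token repeats its left neighbour.
        repeated += sum(1 for prev, cur in zip(tokens, tokens[1:]) if prev == cur)
    return repeated / total if total else 0.0


# Adjacent repetitions are counted: "we we" (1) and "like like like" (2).
adjacent = [["we", "we", "like", "like", "like", "tea"]]
# Redundant content that is not adjacent ("like tea ... like tea") is missed.
non_adjacent = [["i", "like", "tea", "and", "like", "tea"]]

ratio_adjacent = continuous_repetition_ratio(adjacent)
ratio_non_adjacent = continuous_repetition_ratio(non_adjacent)
```

Under this assumed definition, the second output scores zero even though it restates the same content, which is the kind of redundancy the abstract says the conventional metric cannot measure.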

Submission history

From: Zhihao Wang
[v1] Sat, 4 May 2024 14:20:28 GMT (42kb,D)