Consistent Human Evaluation of Machine Translation across Language Pairs

Licht, Daniel; Gao, Cynthia; Lam, Janice; Guzman, Francisco; Diab, Mona; Koehn, Philipp

(Submitted on 17 May 2022)

Abstract: Obtaining meaningful quality scores for machine translation systems through human evaluation remains a challenge given the high variability between human evaluators, partly due to subjective expectations for translation quality for different language pairs. We propose a new metric called XSTS that is more focused on semantic equivalence and a cross-lingual calibration method that enables more consistent assessment. We demonstrate the effectiveness of these novel contributions in large-scale evaluation studies across up to 14 language pairs, with translation both into and out of English.

Comments:	10 pages
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.08533 [cs.CL]
	(or arXiv:2205.08533v1 [cs.CL] for this version)

Submission history

From: Philipp Koehn
[v1] Tue, 17 May 2022 17:57:06 GMT (266kb)
