Computer Science > Computation and Language
Title: Machine Translation System Selection from Bandit Feedback
(Submitted on 22 Feb 2020 (v1), last revised 2 Sep 2020 (this version, v2))
Abstract: Adapting machine translation systems in the real world is a difficult problem. Unlike in offline training, users cannot provide the kind of fine-grained feedback (such as correct translations) typically used to improve a system. Moreover, different users have different translation needs, and even a single user's needs may change over time.
In this work we take a different approach, treating the problem of adaptation as one of selection. Instead of adapting a single system, we train many translation systems using different architectures, datasets, and optimization methods. Using bandit learning techniques on simulated user feedback, we learn a policy to choose which system to use for a particular translation task. We show that our approach can (1) quickly adapt to address domain changes in translation tasks, (2) outperform the single best system in mixed-domain translation tasks, and (3) make effective instance-specific decisions when using contextual bandit strategies.
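To make the selection framing concrete, here is a minimal sketch of a non-contextual bandit choosing among several translation systems. It is an illustration only, not the authors' implementation: the abstract does not say which bandit algorithms are used, so epsilon-greedy is swapped in as an assumed stand-in, and the names `EpsilonGreedySelector`, `simulate`, and `mean_quality`, along with the Gaussian reward model for simulated user feedback, are hypothetical.

```python
import random


class EpsilonGreedySelector:
    """Chooses one of several translation systems (arms) per request.

    Hypothetical helper: epsilon-greedy is an assumed strategy,
    not one confirmed by the paper's abstract.
    """

    def __init__(self, n_systems, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_systems    # times each system was chosen
        self.values = [0.0] * n_systems  # running mean reward per system
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random system with probability epsilon,
        # otherwise exploit the system with the best mean reward so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, system, reward):
        # Incremental update of the running mean for the chosen system.
        self.counts[system] += 1
        self.values[system] += (reward - self.values[system]) / self.counts[system]


def simulate(mean_quality, steps=2000, seed=0):
    """Runs the selector against simulated feedback.

    mean_quality[i] is an assumed average feedback score in [0, 1]
    for system i; real feedback would come from users.
    """
    feedback_rng = random.Random(seed + 1)
    selector = EpsilonGreedySelector(len(mean_quality), seed=seed)
    for _ in range(steps):
        system = selector.select()
        # Noisy scalar feedback for the chosen system only (bandit setting),
        # clipped to the [0, 1] range.
        noisy = feedback_rng.gauss(mean_quality[system], 0.1)
        reward = min(1.0, max(0.0, noisy))
        selector.update(system, reward)
    return selector


if __name__ == "__main__":
    result = simulate([0.3, 0.5, 0.7])
    print("pull counts:", result.counts)
    print("estimated quality:", [round(v, 2) for v in result.values])
```

Only the chosen system receives feedback at each step, which is what distinguishes this from supervised training on reference translations. A contextual variant, as mentioned in point (3), would additionally condition the choice on features of the input to be translated; that is not shown here.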
Submission history
From: Jason Naradowsky
[v1] Sat, 22 Feb 2020 06:54:04 GMT (1132kb,D)
[v2] Wed, 2 Sep 2020 04:14:07 GMT (1673kb,D)