Knowledge Distillation for Quality Estimation

Gajbhiye, Amit; Fomicheva, Marina; Alva-Manchego, Fernando; Blain, Frédéric; Obamuyide, Abiola; Aletras, Nikolaos; Specia, Lucia

(Submitted on 1 Jul 2021)

Abstract: Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations. Recent success in QE stems from the use of multilingual pre-trained representations, where very large models lead to impressive results. However, the inference time, disk and memory requirements of such models do not allow for wide usage in the real world. Models trained on distilled pre-trained representations remain prohibitively large for many usage scenarios. We instead propose to directly transfer knowledge from a strong QE teacher model to a much smaller model with a different, shallower architecture. We show that this approach, in combination with data augmentation, leads to light-weight QE models that perform competitively with distilled pre-trained representations with 8x fewer parameters.

Comments:	ACL Findings 2021
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2107.00411 [cs.CL]
	(or arXiv:2107.00411v1 [cs.CL] for this version)

Submission history

From: Amit Gajbhiye
[v1] Thu, 1 Jul 2021 12:36:21 GMT (206kb,D)
