Word Alignment in the Era of Deep Learning: A Tutorial

Li, Bryan

Title: Word Alignment in the Era of Deep Learning: A Tutorial

Authors: Bryan Li

(Submitted on 30 Nov 2022)

Abstract: The word alignment task, despite its prominence in the era of statistical machine translation (SMT), is niche and under-explored today. In this two-part tutorial, we argue for the continued relevance of word alignment. The first part provides a historical background to word alignment as a core component of the traditional SMT pipeline. We zero in on GIZA++, an unsupervised, statistical word aligner with surprising longevity. Jumping forward to the era of neural machine translation (NMT), we show how insights from word alignment inspired the attention mechanism fundamental to present-day NMT. The second part shifts to a survey approach. We cover neural word aligners, showing the slow but steady progress towards surpassing GIZA++ performance. Finally, we cover the present-day applications of word alignment, from cross-lingual annotation projection to improving translation.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2212.00138 [cs.CL]
	(or arXiv:2212.00138v1 [cs.CL] for this version)

Submission history

From: Bryan Li
[v1] Wed, 30 Nov 2022 22:07:34 GMT (7072kb,D)
