Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning

Kádár, Ákos; Chrupała, Grzegorz; Alishahi, Afra; Elliott, Desmond

(Submitted on 9 Nov 2019)

Abstract: Recent work has highlighted the advantage of jointly learning grounded sentence representations from multiple languages. However, the data used in these studies have been limited to an aligned scenario: the same images annotated with sentences in multiple languages. We focus on the more realistic disjoint scenario, in which there is no overlap between the images in multilingual image–caption datasets. We confirm that training with aligned data results in better grounded sentence representations than training with disjoint data, as measured by image–sentence retrieval performance. To close this performance gap, we propose a pseudopairing method that generates synthetically aligned English–German–image triples from the disjoint sets. The method first trains a model on the disjoint data and then creates new triples across datasets using sentence similarity under the learned model. Experiments show that pseudopairs improve image–sentence retrieval performance compared to disjoint training, despite requiring no external data or models. However, we find that generating the synthetic datasets with an external machine translation model results in even better performance.
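
As an illustration only, the pseudopairing step described in the abstract might be sketched as follows. The abstract does not say which similarity measure is used, how weak matches are handled, or which image is attached to each new triple. The cosine similarity, the optional `min_sim` threshold, the choice to reuse the English caption's image, and every function and parameter name below are assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def make_pseudopairs(en_emb, de_emb, en_image_ids, min_sim=None):
    """Match each English caption to its most similar German caption.

    en_emb, de_emb: sentence embeddings from a model already trained on the
        disjoint data (hypothetical inputs; one row per caption).
    en_image_ids: the image paired with each English caption.
    min_sim: optional cosine threshold (an assumption of this sketch).

    Returns a list of (english_index, german_index, image_id, similarity).
    """
    en = l2_normalize(np.asarray(en_emb, dtype=float))
    de = l2_normalize(np.asarray(de_emb, dtype=float))
    sim = en @ de.T                 # pairwise cosine similarities
    best = sim.argmax(axis=1)       # nearest German caption per English caption
    triples = []
    for i, j in enumerate(best):
        score = float(sim[i, j])
        if min_sim is not None and score < min_sim:
            continue                # drop weak matches when a threshold is set
        triples.append((i, int(j), en_image_ids[i], score))
    return triples


# Toy usage with made-up 2-d embeddings.
toy_triples = make_pseudopairs(
    en_emb=[[1.0, 0.0], [0.0, 1.0]],
    de_emb=[[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]],
    en_image_ids=["img_a", "img_b"],
)
```

In this sketch the resulting triples would be added to the training data alongside the original disjoint image–caption pairs; the mirror-image direction (German captions paired with English ones, reusing the German caption's image) could be produced by swapping the arguments.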

Comments: 10 pages
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:1911.03678 [cs.CL] (or arXiv:1911.03678v1 [cs.CL] for this version)

Submission history

From: Desmond Elliott
[v1] Sat, 9 Nov 2019 12:34:01 GMT (2029kb,D)
