Language Models are Few-shot Multilingual Learners

Winata, Genta Indra; Madotto, Andrea; Lin, Zhaojiang; Liu, Rosanne; Yosinski, Jason; Fung, Pascale

(Submitted on 16 Sep 2021)

Abstract: General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art cross-lingual models.
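
The in-context setup described in the abstract (a few labeled English examples as context, a non-English test sample as the query, no parameter updates) can be sketched roughly as follows. This is a minimal illustration only: the prompt template, the label names, the toy sentences, and the `score` callable are assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, Sequence, Tuple


def build_prompt(examples: Sequence[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled examples, then append the unlabeled query."""
    # The "Text:/Label:" template is an assumed format, not the paper's.
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in examples]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)


def classify(
    examples: Sequence[Tuple[str, str]],
    query: str,
    labels: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the label whose continuation receives the highest score.

    `score(prompt, continuation)` is a hypothetical stand-in for a frozen
    language model's log-likelihood of `continuation` given `prompt`.
    """
    prompt = build_prompt(examples, query)
    return max(labels, key=lambda label: score(prompt, " " + label))


# Toy data invented for illustration: English context, Spanish query.
english_examples = [
    ("Wake me up at seven tomorrow.", "alarm"),
    ("Will it rain this afternoon?", "weather"),
]
spanish_query = "¿Va a llover mañana?"

prompt = build_prompt(english_examples, spanish_query)
print(prompt)
```

In practice, `score` would wrap whichever pre-trained model is being evaluated; because the model is only queried and never fine-tuned, the same function works unchanged for any query language.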

Comments:	14 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2109.07684 [cs.CL]
	(or arXiv:2109.07684v1 [cs.CL] for this version)

Submission history

From: Genta Indra Winata
[v1] Thu, 16 Sep 2021 03:08:22 GMT (12105kb,D)
