Title: As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages

Abstract: Large generative language models have been very successful for English, but other languages lag behind, in part due to data and computational limitations. We propose a method that may overcome these problems by adapting existing pre-trained models to new languages. Specifically, we describe the adaptation of English GPT-2 to Italian and Dutch by retraining lexical embeddings without tuning the Transformer layers. As a result, we obtain lexical embeddings for Italian and Dutch that are aligned with the original English lexical embeddings. Additionally, we scale up complexity by transforming relearned lexical embeddings of GPT-2 small to the GPT-2 medium embedding space. This method minimises the amount of training and prevents losing information during adaptation that was learned by GPT-2. English GPT-2 models with relearned lexical embeddings can generate realistic sentences in Italian and Dutch. Though on average these sentences are still identifiable as artificial by humans, they are assessed on par with sentences generated by a GPT-2 model fully trained from scratch.
Comments: Findings of ACL 2021 Camera Ready
Subjects: Computation and Language (cs.CL)
Journal reference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.74
Cite as: arXiv:2012.05628 [cs.CL]
  (or arXiv:2012.05628v3 [cs.CL] for this version)
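
The core idea in the abstract, retraining lexical embeddings while leaving the Transformer layers untouched, can be illustrated with a toy sketch. This is not the authors' training setup: the frozen "layer" here is a single fixed 2x2 matrix standing in for the pre-trained Transformer, and the function names (`forward`, `train_embeddings`), the tokens, and the target vectors are all hypothetical. The sketch only shows that gradient updates reach the embeddings and never the frozen weights.

```python
import random


def forward(embedding, frozen_w):
    """Apply the frozen linear layer (a stand-in for the Transformer) to one embedding."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in frozen_w]


def train_embeddings(frozen_w, targets, dim, steps=500, lr=0.05, seed=0):
    """Learn one embedding per token by gradient descent on squared error.

    Only the embeddings are updated; frozen_w is read but never written.
    Returns the learned embeddings and the summed loss per step.
    """
    rng = random.Random(seed)
    # Hypothetical new-language vocabulary, initialised with small random values.
    emb = {tok: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for tok in targets}
    history = []
    for _ in range(steps):
        total = 0.0
        for tok, target in targets.items():
            out = forward(emb[tok], frozen_w)
            err = [o - t for o, t in zip(out, target)]
            total += sum(e * e for e in err)
            # Gradient of the squared error w.r.t. the embedding: 2 * W^T * err.
            grads = [
                2.0 * sum(frozen_w[i][j] * err[i] for i in range(len(err)))
                for j in range(dim)
            ]
            for j in range(dim):
                emb[tok][j] -= lr * grads[j]
        history.append(total)
    return emb, history


if __name__ == "__main__":
    # Assumed toy values: a fixed 2x2 "pre-trained" layer and two made-up targets.
    frozen = [[1.0, 0.5], [-0.5, 1.0]]
    learned, losses = train_embeddings(
        frozen, {"huis": [1.0, 0.0], "casa": [0.0, 1.0]}, dim=2
    )
    print("first loss:", losses[0], "last loss:", losses[-1])
```

In a real setting the frozen component would be the full pre-trained GPT-2 stack and the objective would be language modelling on the new language, but the division of labour is the same: the pre-trained weights stay fixed and only the new vocabulary's embeddings are fitted to them.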

Submission history

From: Wietse de Vries
[v1] Thu, 10 Dec 2020 12:27:16 GMT (121kb,D)
[v2] Sat, 22 May 2021 09:21:35 GMT (130kb,D)
[v3] Wed, 9 Jun 2021 07:57:32 GMT (130kb,D)