Title: GLGE: A New General Language Generation Evaluation Benchmark

Abstract: Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress in pretraining and transfer learning in Natural Language Processing (NLP). These benchmarks mostly focus on a range of Natural Language Understanding (NLU) tasks, without considering Natural Language Generation (NLG) models. In this paper, we present the General Language Generation Evaluation (GLGE), a new multi-task benchmark for evaluating the generalization capabilities of NLG models across eight language generation tasks. For each task, we further design three subtasks of increasing difficulty (GLGE-Easy, GLGE-Medium, and GLGE-Hard), which yields 24 subtasks for comprehensively comparing model performance. To encourage research on pretraining and transfer learning for NLG models, we make GLGE publicly available and build a leaderboard with strong baselines including MASS, BART, and ProphetNet. The source code and dataset are publicly available at this https URL.
Comments: Findings of the Association for Computational Linguistics: ACL 2021
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2011.11928 [cs.CL]
  (or arXiv:2011.11928v3 [cs.CL] for this version)

Submission history

From: Dayiheng Liu
[v1] Tue, 24 Nov 2020 06:59:45 GMT (189kb,D)
[v2] Wed, 19 May 2021 11:54:23 GMT (882kb,D)
[v3] Tue, 1 Jun 2021 08:01:50 GMT (858kb,D)
