Computer Science > Computation and Language
Title: Few-NERD: A Few-Shot Named Entity Recognition Dataset
(Submitted on 16 May 2021 (v1), last revised 1 Sep 2021 (this version, v6))
Abstract: Recently, considerable literature has grown up around the theme of few-shot named entity recognition (NER), but little published benchmark data focuses specifically on this practical and challenging task. Current approaches collect existing supervised NER datasets and reorganize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia comprising 4,601,160 words, each annotated either as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and that the problem requires further research. We make Few-NERD public at this https URL
Submission history
From: Ning Ding
[v1] Sun, 16 May 2021 15:53:17 GMT (555kb,D)
[v2] Wed, 19 May 2021 08:50:14 GMT (556kb,D)
[v3] Mon, 31 May 2021 06:56:03 GMT (5981kb,D)
[v4] Wed, 2 Jun 2021 07:23:06 GMT (5981kb,D)
[v5] Sun, 20 Jun 2021 14:55:18 GMT (5981kb,D)
[v6] Wed, 1 Sep 2021 05:35:32 GMT (5981kb,D)