Computer Science > Computation and Language
Title: NorNE: Annotating Named Entities for Norwegian
(Submitted on 27 Nov 2019 (v1), last revised 6 Mar 2020 (this version, v2))
Abstract: This paper presents NorNE, a manually annotated corpus of named entities which extends the annotation of the existing Norwegian Dependency Treebank. Comprising both of the official standards of written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000 tokens and annotates a rich set of entity types including persons, organizations, locations, geo-political entities, products, and events, in addition to a class corresponding to nominals derived from names. We here present details on the annotation effort, guidelines, inter-annotator agreement and an experimental analysis of the corpus using a neural sequence labeling architecture.
Submission history
From: Erik Velldal
[v1] Wed, 27 Nov 2019 13:30:36 GMT (495kb,D)
[v2] Fri, 6 Mar 2020 09:38:01 GMT (500kb,D)