Title: MobIE: A German Dataset for Named Entity Recognition, Entity Linking and Relation Extraction in the Mobility Domain

Abstract: We present MobIE, a German-language dataset, which is human-annotated with 20 coarse- and fine-grained entity types and entity linking information for geographically linkable entities. The dataset consists of 3,232 social media texts and traffic reports with 91K tokens, and contains 20.5K annotated entities, 13.1K of which are linked to a knowledge base. A subset of the dataset is human-annotated with seven mobility-related, n-ary relation types, while the remaining documents are annotated using a weakly-supervised labeling approach implemented with the Snorkel framework. To the best of our knowledge, this is the first German-language dataset that combines annotations for NER, EL and RE, and thus can be used for joint and multi-task learning of these fundamental information extraction tasks. We make MobIE public at this https URL
Comments: Accepted at KONVENS 2021. 5 pages, 3 figures, 5 tables
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2108.06955 [cs.CL]
  (or arXiv:2108.06955v2 [cs.CL] for this version)

Submission history

From: Leonhard Hennig
[v1] Mon, 16 Aug 2021 08:21:50 GMT (672kb,D)
[v2] Mon, 28 Mar 2022 09:40:12 GMT (672kb,D)