Title: Extracting a Knowledge Base of COVID-19 Events from Social Media

Abstract: In this paper, we present a manually annotated corpus of 10,000 tweets containing public reports of five types of COVID-19 events: positive and negative tests, deaths, denied access to testing, and claimed cures and preventions. We designed slot-filling questions for each event type and annotated a total of 31 fine-grained slots, such as the location of events, recent travel, and close contacts. We show that our corpus can support fine-tuning BERT-based classifiers to automatically extract publicly reported events and help track the spread of a new disease. We also demonstrate that, by aggregating events extracted from millions of tweets, we achieve surprisingly high precision when answering complex queries, such as "Which organizations have employees that tested positive in Philadelphia?" We will release our corpus (with user information removed), automatic extraction models, and the corresponding knowledge base to the research community.
Comments: Accepted at COLING 2022
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Cite as: arXiv:2006.02567 [cs.CL]
  (or arXiv:2006.02567v4 [cs.CL] for this version)

Submission history

From: Shi Zong
[v1] Wed, 3 Jun 2020 22:39:24 GMT (980kb,D)
[v2] Wed, 24 Jun 2020 16:29:20 GMT (983kb,D)
[v3] Thu, 4 Nov 2021 05:21:15 GMT (16169kb,D)
[v4] Fri, 9 Sep 2022 07:21:03 GMT (2613kb,D)
