Title: SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts

Abstract: Determining coreference of concept mentions across multiple documents is a fundamental task in natural language understanding. Previous work on cross-document coreference resolution (CDCR) typically considers mentions of events in the news, which seldom involve abstract technical concepts that are prevalent in science and technology. These complex concepts take diverse or ambiguous forms and have many hierarchical levels of granularity (e.g., tasks and subtasks), posing challenges for CDCR. We present a new task of Hierarchical CDCR (H-CDCR) with the goal of jointly inferring coreference clusters and hierarchy between them. We create SciCo, an expert-annotated dataset for H-CDCR in scientific papers, 3X larger than the prominent ECB+ resource. We study strong baseline models that we customize for H-CDCR, and highlight challenges for future work.
Comments: Accepted to AKBC 2021. Data and code available at this https URL
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as: arXiv:2104.08809 [cs.CL]
  (or arXiv:2104.08809v3 [cs.CL] for this version)

Submission history

From: Arie Cattan
[v1] Sun, 18 Apr 2021 10:42:20 GMT (6495kb,D)
[v2] Fri, 27 Aug 2021 14:17:48 GMT (6545kb,D)
[v3] Wed, 1 Sep 2021 10:09:15 GMT (6546kb,D)