Title: Pattern-based Acquisition of Scientific Entities from Scholarly Article Titles

Abstract: We describe a rule-based approach for the automatic acquisition of salient scientific entities from the titles of Computational Linguistics (CL) scholarly articles. Two observations motivated the approach: (i) titles tend to note the salient aspects of an article's contribution; and (ii) the salient terms follow pattern regularities that can be expressed as a set of rules. We selected only those lexico-syntactic patterns that were easily recognizable, occurred frequently, and positionally indicated a scientific entity type. The rules were developed on a collection of 50,237 CL titles covering all articles in the ACL Anthology. In total, 19,799 research problems, 18,111 solutions, 20,033 resources, 1,059 languages, 6,878 tools, and 21,687 methods were extracted at an average precision of 75%.
Comments: 8 pages; accepted for publication in ICADL 2021 as a short paper
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Digital Libraries (cs.DL)
Cite as: arXiv:2109.00199 [cs.IR]
  (or arXiv:2109.00199v2 [cs.IR] for this version)
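
To illustrate the kind of lexico-syntactic title rule the abstract describes, here is a minimal Python sketch. It is not the authors' rule set: the two connector patterns ("for" and "using"), the entity types assigned to each side, and the names RULES and extract_entities are all assumptions made for illustration only.

```python
import re

# Hypothetical rules, NOT taken from the paper. Each rule pairs a
# connector-word pattern with the entity types assumed for the text
# to the left and to the right of that connector.
RULES = [
    (re.compile(r"^(?P<left>.+?)\s+for\s+(?P<right>.+)$", re.IGNORECASE),
     "solution", "research_problem"),
    (re.compile(r"^(?P<left>.+?)\s+using\s+(?P<right>.+)$", re.IGNORECASE),
     "research_problem", "method"),
]


def extract_entities(title):
    """Return (entity_type, phrase) pairs from the first rule that matches."""
    cleaned = title.strip()
    for pattern, left_type, right_type in RULES:
        match = pattern.match(cleaned)
        if match:
            return [
                (left_type, match.group("left")),
                (right_type, match.group("right")),
            ]
    # No rule applies: the title yields no entities under this sketch.
    return []


if __name__ == "__main__":
    for example in (
        "Neural Networks for Sentiment Analysis",
        "Sentiment Analysis using Convolutional Networks",
    ):
        print(example, "->", extract_entities(example))
```

The position of a phrase relative to the connector decides its type, which is one plausible reading of "positionally indicated a scientific entity type"; a realistic rule set would need many more patterns and conflict handling.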

Submission history

From: Jennifer D'Souza
[v1] Wed, 1 Sep 2021 05:59:06 GMT (388kb,D)
[v2] Fri, 17 Sep 2021 07:06:04 GMT (227kb)