Title: CoCon: A Data Set on Combined Contextualized Research Artifact Use

Abstract: In the wake of information overload in academia, methodologies and systems for search, recommendation, and prediction to aid researchers in identifying relevant research are actively studied and developed. Existing work, however, is limited in terms of granularity, focusing only on the level of papers or a single type of artifact, such as data sets. To enable more holistic analyses and systems dealing with academic publications and their content, we propose CoCon, a large scholarly data set reflecting the combined use of research artifacts, contextualized in academic publications' full-text. Our data set comprises 35 k artifacts (data sets, methods, models, and tasks) and 340 k publications. We additionally formalize a link prediction task for "combined research artifact use prediction" and provide code to facilitate analyses of and the development of ML applications on our data. All data and code are publicly available at this https URL
Comments: submitted to JCDL2023
Subjects: Digital Libraries (cs.DL); Computation and Language (cs.CL)
DOI: 10.1109/JCDL57899.2023.00016
Cite as: arXiv:2303.15193 [cs.DL]
  (or arXiv:2303.15193v1 [cs.DL] for this version)

Submission history

From: Tarek Saier
[v1] Mon, 27 Mar 2023 13:29:09 GMT (277kb,D)
