Title: Relationship extraction for knowledge graph creation from biomedical literature

Abstract: Biomedical research is growing at such an exponential pace that scientists, researchers, and practitioners are no longer able to cope with the amount of published literature in the domain. The knowledge presented in the literature needs to be systematized in such a way that claims and hypotheses can be easily found, accessed, and validated. Knowledge graphs can provide such a framework for semantic knowledge representation from literature. However, in order to build a knowledge graph, it is necessary to extract knowledge as relationships between biomedical entities and to normalize both the entities and the relationship types. In this paper, we present and compare a few rule-based and machine learning-based methods for scalable relationship extraction from biomedical literature and for integration into knowledge graphs. The machine learning methods include Naive Bayes and Random Forests as examples of traditional machine learning, and DistilBERT and T5-based models as examples of modern deep learning transformers. We examine how resilient these methods are to unbalanced and fairly small datasets. Our experiments show that transformer-based models handle both small datasets (due to pre-training on a large dataset) and unbalanced datasets well. The best performing model was the DistilBERT-based model fine-tuned on balanced data, with a reported F1-score of 0.89.
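The abstract reports results as an F1-score and stresses unbalanced datasets. The following is a minimal sketch, not taken from the paper, of why F1 is a more informative measure than accuracy when sentences expressing a relationship are rare. The function names, the toy labels, and the 10/90 class split are all assumptions made for illustration.

```python
# Illustrative sketch only: names and toy data are hypothetical, not from the paper.

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Assumed toy data: 10 sentences that express a relationship, 90 that do not.
y_true = [1] * 10 + [0] * 90

# A degenerate classifier that never predicts a relationship.
always_negative = [0] * 100

# A classifier that finds 8 of the 10 relationships with 4 false positives.
imperfect = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 86

print(accuracy(y_true, always_negative), f1_score(y_true, always_negative))
print(accuracy(y_true, imperfect), f1_score(y_true, imperfect))
```

On this toy data the classifier that never predicts a relationship still reaches high accuracy while its F1-score is zero, which is why a per-class measure such as F1 is the natural choice for comparing methods on unbalanced data.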
Comments: Paper submitted to Journal of Semantic Web
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as: arXiv:2201.01647 [cs.AI]
  (or arXiv:2201.01647v3 [cs.AI] for this version)

Submission history

From: Nikola Milošević Dr
[v1] Wed, 5 Jan 2022 15:09:33 GMT (560kb,D)
[v2] Wed, 30 Mar 2022 14:56:45 GMT (571kb,D)
[v3] Mon, 18 Apr 2022 14:58:37 GMT (875kb,D)
[v4] Sun, 7 Aug 2022 10:58:29 GMT (598kb,D)