Title: Extraction of Pharmacokinetic Evidence of Drug-drug Interactions from the Literature

Abstract: Drug-drug interactions (DDIs) are major causes of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. While evidence for DDIs ranges in scale from intracellular biochemistry to human populations, literature mining methods have not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDIs ... We used a manually curated corpus of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining in classifying PubMed abstracts containing pharmacokinetic evidence for DDIs, as well as in extracting sentences containing such evidence. We implemented a text mining pipeline using several linear classifiers and a variety of feature transformation methods. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, from various publicly available named entity recognizers, and from pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1 ≈ 0.93, MCC ≈ 0.74, iAUC ≈ 0.99) and sentences (F1 ≈ 0.76, MCC ≈ 0.65, iAUC ≈ 0.83). We found that word-bigram textual features were important for achieving optimal classifier performance, that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification, and that some drug-related named entity recognition tools and dictionaries led to slight but significant improvements, especially in classification of evidence sentences. ...
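
To make the quantities in the abstract concrete, the following is a minimal Python sketch of word-bigram feature counting and of the F1 and MCC scores computed from confusion-matrix counts. It is not the authors' pipeline: the function names, the whitespace tokenizer, and the example sentence and counts are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def unigram_bigram_features(text):
    """Count lowercase word unigrams and adjacent word bigrams.

    Hypothetical helper: whitespace tokenization is an assumption,
    not the tokenizer used in the paper.
    """
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Illustrative sentence and counts only, not data from the paper.
    feats = unigram_bigram_features("Drug A inhibits drug B")
    print(feats["drug"], feats["inhibits drug"])
    print(f1_score(tp=8, fp=2, fn=2), mcc(tp=8, fp=2, tn=8, fn=2))
```

Unlike F1, MCC also accounts for true negatives, which is one reason the two scores can differ noticeably on the same classifier, as they do in the figures reported above.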
Subjects: Machine Learning (stat.ML); Information Retrieval (cs.IR); Quantitative Methods (q-bio.QM)
ACM classes: H.2.8; H.3.1; J.3
Cite as: arXiv:1412.0744 [stat.ML]
  (or arXiv:1412.0744v1 [stat.ML] for this version)

Submission history

From: Artemy Kolchinsky
[v1] Tue, 2 Dec 2014 00:01:39 GMT (967kb,D)
[v2] Mon, 18 May 2015 16:45:42 GMT (621kb,D)
