Title: Personalized Prognostic Models for Oncology: A Machine Learning Approach

Abstract: We have applied a little-known data transformation to subsets of the publicly available Surveillance, Epidemiology, and End Results (SEER) data of the National Cancer Institute (NCI) to make them suitable input for standard machine learning classifiers. This transformation properly treats the right-censored records in the SEER data, and the resulting Random Forest and Multi-Layer Perceptron models predict full survival curves. When the 6-, 12-, and 60-month points of the resulting survival curves are treated as 3 binary classifiers, the 18 resulting classifiers have AUC values ranging from 0.765 to 0.885. Further evidence that the models have generalized well from the training data is provided by the extremely high levels of agreement between the random forest and neural network models' predictions on the 6-, 12-, and 60-month binary classifiers.
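The abstract does not name the transformation. One common way to make right-censored survival records usable by an ordinary binary classifier is person-period (discrete-time) expansion, and the sketch below assumes that approach purely for illustration. The function names `expand_person_period` and `survival_curve`, the monthly time grid, and the single-feature example are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def expand_person_period(
    duration: int, died: bool, features: Sequence[float]
) -> List[Tuple[List[float], int]]:
    """Expand one patient into one row per month at risk (illustrative only).

    Each row is (features + [month], label). The label is 1 only in the
    month of death. A censored patient (died=False) contributes label-0
    rows up to the last observed month and nothing afterwards, which is
    how the censoring is respected.
    """
    rows = []
    for month in range(1, duration + 1):
        label = 1 if (died and month == duration) else 0
        rows.append((list(features) + [month], label))
    return rows


def survival_curve(hazards: Sequence[float]) -> List[float]:
    """Turn per-month death probabilities into a survival curve.

    S(t) is the running product of (1 - hazard) over months 1..t.
    """
    curve = []
    alive = 1.0
    for h in hazards:
        alive *= 1.0 - h
        curve.append(alive)
    return curve


# Hypothetical patient, age 61, who died in month 3 of follow-up.
dead_rows = expand_person_period(3, True, [61.0])
# Hypothetical patient, age 47, censored after month 2.
censored_rows = expand_person_period(2, False, [47.0])
# Hazards as a classifier might predict them for months 1 and 2.
curve = survival_curve([0.1, 0.2])
```

Under this assumption, a classifier trained on the expanded rows estimates a monthly hazard. Reading the resulting curve at months 6, 12, and 60 and comparing it with observed status at those horizons would give binary predictions of the kind the abstract evaluates with AUC.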
Comments: 28 pages, 3 figures
Subjects: Applications (stat.AP); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1606.07369 [stat.AP]
  (or arXiv:1606.07369v1 [stat.AP] for this version)

Submission history

From: David Dooling
[v1] Wed, 22 Jun 2016 15:55:22 GMT (776kb,D)
