Title: Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction

Abstract: The accumulation of high-quality data is becoming ubiquitous in the health domain. There is increasing opportunity to exploit rich data from normal subjects to improve supervised estimators in specific diseases with notorious data scarcity. We demonstrate that low-dimensional embedding spaces can be derived from the UK Biobank population dataset and used to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics. Phenotype predictions facilitated by Variational Autoencoder manifolds typically scaled better with increasing unlabeled data than dimensionality reduction by PCA or Isomap. Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
Comments: Accepted at NIPS 2017 Workshop on Machine Learning for Health
Subjects: Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
Cite as: arXiv:2110.06135 [cs.LG]
  (or arXiv:2110.06135v1 [cs.LG] for this version)
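
The general setup described in the abstract (learn a low-dimensional embedding from plentiful unlabeled data, then fit a supervised predictor on a handful of labeled samples in that embedding) can be illustrated with a minimal sketch. This is not the authors' code: it uses made-up synthetic data rather than UK Biobank, and it stands in the simplest PCA-style baseline (first principal component via power iteration) for the Variational Autoencoder. All function names, sample sizes, and noise levels are hypothetical choices for illustration.

```python
import random


def make_samples(n, rng, noise=0.1):
    """Draw n synthetic 10-feature samples driven by one hidden factor z."""
    loadings = [0.2, 0.4, 0.6, 0.8, 1.0] * 2  # fixed, made-up loadings
    zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    X = [[z * w + rng.gauss(0.0, noise) for w in loadings] for z in zs]
    return X, zs


def fit_first_component(X, iters=50):
    """Estimate the feature means and first principal axis by power iteration."""
    n, dim = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in X]
    v = [1.0 / dim ** 0.5] * dim  # unit-length starting direction
    for _ in range(iters):
        # One step of v <- X^T (X v), followed by renormalization.
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def embed(X, mean, v):
    """Project samples onto the learned axis (a 1-D embedding)."""
    return [sum((x - m) * c for x, m, c in zip(row, mean, v)) for row in X]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)

# Large unlabeled pool: used only to learn the embedding.
X_pool, _ = make_samples(2000, rng)

# Small labeled set and a test set; the "phenotype" y depends on z (assumption).
X_lab, z_lab = make_samples(20, rng)
y_lab = [2.0 * z + rng.gauss(0.0, 0.1) for z in z_lab]
X_test, z_test = make_samples(200, rng)
y_test = [2.0 * z + rng.gauss(0.0, 0.1) for z in z_test]

mean, v = fit_first_component(X_pool)
slope, intercept = fit_line(embed(X_lab, mean, v), y_lab)
y_pred = [slope * e + intercept for e in embed(X_test, mean, v)]
score = r_squared(y_test, y_pred)
print(f"held-out R^2 with 20 labels: {score:.3f}")
```

The labeled samples never influence the embedding here; only the unlabeled pool does. Replacing `fit_first_component` and `embed` with a nonlinear encoder would move the sketch closer to the VAE setting the abstract compares against PCA and Isomap.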

Submission history

From: Marc-Andre Schulz
[v1] Tue, 12 Oct 2021 16:25:50 GMT
