Title: Log-ratio Lasso: Scalable, Sparse Estimation for Log-ratio Models

Abstract: Positive-valued signal data are common in many biological and medical applications, where the data are often generated by imaging techniques such as mass spectrometry. In such settings, the relative intensities of the raw features are often the scientifically meaningful quantities, so it is of interest to identify relevant features that take the form of log-ratios of the raw inputs. When the log-ratios of all pairs of predictors are included, the dimensionality of the predictor space becomes large, so computationally efficient statistical procedures are required. We introduce an embedding of the log-ratio parameter space into a space of much lower dimension and develop an efficient penalized fitting procedure using this more tractable representation. This procedure serves as the foundation for a two-step fitting procedure that combines a convex filtering step with a second non-convex pruning step to yield highly sparse solutions. On a cancer proteomics data set we find that these methods fit highly sparse models with log-ratio features of known biological relevance while greatly improving upon the predictive accuracy of less interpretable methods.
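The dimension reduction mentioned in the abstract can be illustrated with a small numerical sketch. It assumes the embedding rests on the elementary identity that any linear combination of pairwise log-ratios log(x_j / x_k) equals a linear combination of the p log-features log(x_j) whose coefficients sum to zero. This is an illustration of that identity only, not the authors' fitting procedure; the function names `pairwise_log_ratios` and `collapse_coefficients` and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def pairwise_log_ratios(X):
    """Build all p*(p-1)/2 features log(x_j / x_k) for j < k, plus the index pairs."""
    logX = np.log(X)
    pairs = list(itertools.combinations(range(X.shape[1]), 2))
    Z = np.column_stack([logX[:, j] - logX[:, k] for j, k in pairs])
    return Z, pairs

def collapse_coefficients(theta, pairs, p):
    """Map pairwise coefficients theta to p coefficients on log(x).

    Each pair (j, k) adds +theta to position j and -theta to position k,
    so the result always sums to zero.
    """
    beta = np.zeros(p)
    for t, (j, k) in zip(theta, pairs):
        beta[j] += t
        beta[k] -= t
    return beta

# Simulated positive-valued data (an assumption made for illustration).
rng = np.random.default_rng(0)
n, p = 50, 6
X = rng.lognormal(size=(n, p))

Z, pairs = pairwise_log_ratios(X)         # n x 15 design in the pairwise space
theta = rng.normal(size=len(pairs))       # arbitrary pairwise coefficients
beta = collapse_coefficients(theta, pairs, p)

pred_pairs = Z @ theta                    # prediction in the large space
pred_low = np.log(X) @ beta               # same prediction using only p features
```

Under this reading, a model over p(p-1)/2 log-ratio features can be handled through p coefficients subject to a sum-to-zero constraint, which is what would make a penalized fit tractable as p grows.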
Subjects: Methodology (stat.ME)
Journal reference: Biometrics 75 (2019) 613-624
DOI: 10.1111/biom.12995
Cite as: arXiv:1709.01139 [stat.ME]
  (or arXiv:1709.01139v1 [stat.ME] for this version)

Submission history

From: Stephen Bates
[v1] Mon, 4 Sep 2017 20:04:39 GMT (95kb,D)
