Statistics > Methodology
Title: $\sigma$-Ridge: group regularized ridge regression via empirical Bayes noise level cross-validation
(Submitted on 29 Oct 2020 (v1), last revised 4 Mar 2021 (this version, v2))
Abstract: Features in predictive models are not exchangeable, yet common supervised models treat them as such. Here we study ridge regression when the analyst can partition the features into $K$ groups based on external side-information. For example, in high-throughput biology, features may represent gene expression, protein abundance or clinical data, and so each feature group represents a distinct modality. The analyst's goal is to choose optimal regularization parameters $\lambda = (\lambda_1, \dotsc, \lambda_K)$, one for each group. In this work, we study the impact of $\lambda$ on the predictive risk of group-regularized ridge regression by deriving limiting risk formulae under a high-dimensional random effects model with $p\asymp n$ as $n \to \infty$. Furthermore, we propose a data-driven method for choosing $\lambda$ that attains the optimal asymptotic risk: the key idea is to interpret the residual noise variance $\sigma^2$ as a regularization parameter to be chosen through cross-validation. An empirical Bayes construction maps the one-dimensional parameter $\sigma$ to the $K$-dimensional vector of regularization parameters, i.e., $\sigma \mapsto \widehat{\lambda}(\sigma)$. Beyond its theoretical optimality, the proposed method is practical and runs as fast as cross-validated ridge regression without feature groups ($K=1$).
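To make the object of study concrete, the following is a minimal sketch of a group-regularized ridge fit with one penalty $\lambda_k$ per feature group. It is an illustration only, not the paper's implementation: the function name `group_ridge`, the integer group encoding, and the objective scaling $\frac{1}{2n}\lVert y - X\beta\rVert^2 + \sum_k \frac{\lambda_k}{2}\lVert \beta_{G_k}\rVert^2$ are all assumptions made here. The empirical Bayes map $\sigma \mapsto \widehat{\lambda}(\sigma)$ is not specified in the abstract, so the penalties are simply taken as given.

```python
import numpy as np


def group_ridge(X, y, groups, lambdas):
    """Group-regularized ridge regression (hypothetical helper).

    Minimizes ||y - X b||^2 / (2n) + sum_k (lambdas[k] / 2) * ||b_{G_k}||^2,
    where G_k is the set of columns j with groups[j] == k.
    The scaling of the objective is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups, dtype=int)    # group index of each feature
    lambdas = np.asarray(lambdas, dtype=float)

    n = X.shape[0]
    per_feature_penalty = lambdas[groups]     # expand K penalties to p features

    # First-order condition: (X'X / n + diag(penalty)) b = X'y / n
    A = X.T @ X / n + np.diag(per_feature_penalty)
    return np.linalg.solve(A, X.T @ y / n)


# Toy usage: two groups of three features, the second penalized more heavily.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 6))
y_demo = X_demo @ np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=50)
beta_demo = group_ridge(X_demo, y_demo, groups=[0, 0, 0, 1, 1, 1], lambdas=[0.1, 10.0])
```

With all $\lambda_k$ equal, this reduces to ordinary ridge regression ($K=1$). In the approach described above, the $K$ penalties would not be tuned separately; a single candidate $\sigma$ would determine the whole vector passed as `lambdas`, and $\sigma$ would be selected by cross-validation.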
Submission history
From: Nikolaos Ignatiadis
[v1] Thu, 29 Oct 2020 17:52:45 GMT (502kb,D)
[v2] Thu, 4 Mar 2021 10:05:36 GMT (500kb,D)