Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

Tyagi, Hemant; Kyrillidis, Anastasios; Gärtner, Bernd; Krause, Andreas


(Submitted on 2 May 2016 (this version), latest version 8 May 2017 (v3))

Abstract: A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM) if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}\phi_{l}(x_l)$, where $\mathcal{S} \subset [d]$ and $|\mathcal{S}| \ll d$. There exists extensive work on estimating $f$ from its samples when the $\phi_l$ and $\mathcal{S}$ are unknown. In this work, we consider a generalized version of SPAMs that also allows for a small number of second order interaction terms. For some $\mathcal{S}_1 \subset [d]$, $\mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d$, $|\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form $f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}\phi_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}\phi_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting, where exact samples of $f$ are obtained, and extends to the noisy setting, where the queries are corrupted with noise. For the noisy setting, we consider two noise models: i.i.d. Gaussian noise and arbitrary but bounded noise. Our main methods for identifying $\mathcal{S}_2$ rely on the estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $\phi_p$, $\phi_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
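The link between the interaction set $\mathcal{S}_2$ and the Hessian can be illustrated with a small sketch. It is not the compressed sensing scheme of the paper: it is a naive finite-difference check over every pair of coordinates, on a toy function whose dimension, components, base point, step size and threshold are all assumptions chosen for illustration. For an $f$ of the form above, the mixed second difference in coordinates $(i, j)$ vanishes whenever $(i, j) \notin \mathcal{S}_2$, so thresholding it at a point where the true mixed partials are nonzero recovers the pairs.

```python
import math
from itertools import combinations

D = 8  # ambient dimension (hypothetical toy value)

def f(x):
    """Toy model (assumed): S_1 = {0}, S_2 = {(2, 5), (3, 6)}."""
    return math.sin(x[0]) + x[2] * x[5] + math.exp(x[3] * x[6])

def mixed_difference(func, x, i, j, h=1e-3):
    """Forward finite-difference estimate of d^2 func / (dx_i dx_j) at x."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return func(y)
    numerator = shifted(h, h) - shifted(h, 0.0) - shifted(0.0, h) + shifted(0.0, 0.0)
    return numerator / (h * h)

def recover_pairs(func, x, threshold=1e-3):
    """Return all pairs (i, j), i < j, whose mixed difference exceeds threshold."""
    return {
        (i, j)
        for i, j in combinations(range(len(x)), 2)
        if abs(mixed_difference(func, x, i, j)) > threshold
    }

# Base point chosen (assumption) so that the true mixed partials are nonzero.
base_point = [0.3 + 0.1 * i for i in range(D)]
recovered_s2 = recover_pairs(f, base_point)
print(sorted(recovered_s2))  # [(2, 5), (3, 6)]
```

This check uses noiseless queries, evaluates all ${d \choose 2}$ pairs, and relies on a single base point, so it can miss a pair whose mixed partial happens to vanish there. It is meant only to show why a sparse Hessian encodes $\mathcal{S}_2$.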

Comments:	45 pages, 6 figures, preliminary version of this paper to appear in proceedings of AISTATS 2016 (available here: arxiv.org/abs/1604.05307). arXiv admin note: text overlap with arXiv:1604.05307
Subjects:	Machine Learning (cs.LG); Information Theory (cs.IT); Numerical Analysis (math.NA); Machine Learning (stat.ML)
Cite as:	arXiv:1605.00609 [cs.LG]
	(or arXiv:1605.00609v1 [cs.LG] for this version)

Submission history

From: Hemant Tyagi
[v1] Mon, 2 May 2016 18:32:19 GMT (283kb,D)
[v2] Fri, 5 May 2017 14:47:25 GMT (283kb,D)
[v3] Mon, 8 May 2017 15:44:45 GMT (288kb,D)
