Depth Separation for Neural Networks

Daniely, Amit


Authors: Amit Daniely

(Submitted on 27 Feb 2017)

Abstract: Let $f:\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}\to\mathbb{R}$ be a function of the form $f(\mathbf{x},\mathbf{x}') = g(\langle\mathbf{x},\mathbf{x}'\rangle)$ for $g:[-1,1]\to \mathbb{R}$. We give a simple proof that poly-size depth two neural networks with (exponentially) bounded weights cannot approximate $f$ whenever $g$ cannot be approximated by a low degree polynomial. Moreover, for many choices of $g$, such as $g(x)=\sin(\pi d^3x)$, the number of neurons must be $2^{\Omega\left(d\log(d)\right)}$. Furthermore, the result holds with respect to the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$. As many functions of the above form can be well approximated by poly-size depth three networks with poly-bounded weights, this establishes a separation between depth two and depth three networks with respect to the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$.
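
To make the objects in the abstract concrete, the following is a minimal sketch, not taken from the paper, of the example target $f(\mathbf{x},\mathbf{x}') = \sin(\pi d^3\langle\mathbf{x},\mathbf{x}'\rangle)$ evaluated on a pair of points drawn uniformly from $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$. The helper names `sample_sphere`, `g`, and `f`, the seed, and the choice $d = 8$ are illustrative assumptions. Uniform sampling is done by the standard trick of normalizing a vector of independent Gaussians.

```python
import math
import random


def sample_sphere(d, rng):
    """Draw a point uniformly from the unit sphere S^{d-1}.

    Normalizing a vector of i.i.d. standard Gaussians gives a
    rotation-invariant, hence uniform, point on the sphere.
    """
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def g(t, d):
    """Example from the abstract: g(t) = sin(pi * d^3 * t) on [-1, 1]."""
    return math.sin(math.pi * d**3 * t)


def f(x, xp):
    """Target f(x, x') = g(<x, x'>), a function of the inner product only."""
    d = len(x)
    inner = sum(a * b for a, b in zip(x, xp))
    return g(inner, d)


# Illustrative (assumed) dimension and seed.
rng = random.Random(0)
d = 8
x, xp = sample_sphere(d, rng), sample_sphere(d, rng)
value = f(x, xp)
```

Because $g$ oscillates on the order of $d^3$ times over $[-1,1]$, a low degree polynomial cannot track it, which is the condition under which the abstract's depth two lower bound applies.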

Subjects:	Machine Learning (cs.LG); Computational Complexity (cs.CC); Machine Learning (stat.ML)
Cite as:	arXiv:1702.08489 [cs.LG]
	(or arXiv:1702.08489v1 [cs.LG] for this version)

Submission history

From: Amit Daniely
[v1] Mon, 27 Feb 2017 19:46:15 GMT
