The Landscape of Deep Learning Algorithms

Zhou, Pan; Feng, Jiashi

(Submitted on 19 May 2017 (v1), last revised 5 Aug 2017 (this version, v2))

Abstract: This paper studies the landscape of the empirical risk of deep neural networks by theoretically analyzing its convergence to the population risk, as well as its stationary points and their properties. For an $l$-layer linear neural network, we prove that its empirical risk uniformly converges to its population risk at the rate of $\mathcal{O}(r^{2l}\sqrt{d\log(l)}/\sqrt{n})$, where $n$ is the training sample size, $d$ is the total weight dimension, and $r$ bounds the magnitude of the weights of each layer. We then derive stability and generalization bounds for the empirical risk based on this result. We also establish the uniform convergence of the gradient of the empirical risk to its population counterpart. We prove a one-to-one correspondence between the non-degenerate stationary points of the empirical and population risks, with convergence guarantees, which describes the landscape of deep neural networks. In addition, we analyze these properties for deep nonlinear neural networks with sigmoid activation functions. We prove similar results for the convergence behavior of their empirical risks and gradients, and analyze the properties of their non-degenerate stationary points.
To the best of our knowledge, this work is the first to theoretically characterize the landscapes of deep learning algorithms. Our results also provide the sample complexity of training a good deep neural network, and a theoretical understanding of how the network depth $l$, the layer width, the network size $d$ and the parameter magnitude determine the neural network landscape.
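
To make the stated rate for linear networks concrete, the following is a minimal sketch that evaluates only the scaling term $r^{2l}\sqrt{d\log(l)}/\sqrt{n}$ from the abstract. The function names `rate_term` and `samples_for_tolerance` are hypothetical, the constant hidden in the $\mathcal{O}(\cdot)$ is taken to be 1, and the logarithm is assumed to be natural; none of these choices come from the paper, so the numbers indicate scaling only, not actual bounds.

```python
import math


def rate_term(n, d, l, r):
    """Scaling term r^(2l) * sqrt(d * log(l)) / sqrt(n) quoted in the abstract.

    Assumptions (not from the paper): hidden constant set to 1, natural log.
    """
    if l < 2:
        # log(l) is zero at l = 1, so the term as written is uninformative there.
        raise ValueError("this sketch requires depth l >= 2")
    return r ** (2 * l) * math.sqrt(d * math.log(l)) / math.sqrt(n)


def samples_for_tolerance(eps, d, l, r):
    """Smallest n with rate_term(n, d, l, r) <= eps.

    Solving r^(2l) * sqrt(d * log(l)) / sqrt(n) <= eps for n gives
    n >= r^(4l) * d * log(l) / eps^2.
    """
    if l < 2:
        raise ValueError("this sketch requires depth l >= 2")
    return math.ceil(r ** (4 * l) * d * math.log(l) / eps ** 2)
```

Under these assumptions, quadrupling $n$ halves the term, while the required $n$ grows as $r^{4l}$, so with $r > 1$ each extra layer multiplies it by roughly $r^{4}$.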

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:1705.07038 [stat.ML]
	(or arXiv:1705.07038v2 [stat.ML] for this version)

Submission history

From: Pan Zhou
[v1] Fri, 19 May 2017 15:07:07 GMT (900kb)
[v2] Sat, 5 Aug 2017 12:30:25 GMT (915kb)
