The Power of Normalization: Faster Evasion of Saddle Points

Levy, Kfir Y.

Computer Science > Machine Learning

Authors: Kfir Y. Levy

(Submitted on 15 Nov 2016)

Abstract: A commonly used heuristic in non-convex optimization is Normalized Gradient Descent (NGD), a variant of gradient descent in which only the direction of the gradient is taken into account and its magnitude is ignored. We analyze this heuristic and show that with carefully chosen parameters and noise injection, this method can provably evade saddle points. We establish the convergence of NGD to a local minimum, and demonstrate rates which improve upon the fastest known first-order algorithm due to Ge et al. (2015).
The effectiveness of our method is demonstrated via an application to the problem of online tensor decomposition, a task for which saddle point evasion is known to result in convergence to global minima.
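
To illustrate the idea of normalized gradient steps with noise injection, here is a minimal sketch. It is not the algorithm analyzed in the paper: the update rule, the names `noisy_ngd` and `grad_f`, the step size, the noise scale, and the toy objective are all assumptions chosen for illustration, and the abstract does not specify the paper's parameter choices. The toy objective f(x, y) = x^4/4 - x^2/2 + y^2/2 has a saddle point at (0, 0) and local minima at (±1, 0).

```python
import numpy as np

def grad_f(p):
    # Gradient of the toy objective f(x, y) = x^4/4 - x^2/2 + y^2/2 (hypothetical example).
    x, y = p
    return np.array([x**3 - x, y])

def noisy_ngd(grad, x0, step=0.01, noise_scale=0.5, iters=2000, seed=0):
    # Illustrative normalized gradient descent with injected noise.
    # All default parameter values are assumptions, not taken from the paper.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        g = grad(x)
        norm = np.linalg.norm(g)
        # Keep only the direction of the gradient; its magnitude is discarded.
        direction = g / norm if norm > 0 else np.zeros_like(g)
        # Random unit vector, scaled by noise_scale relative to the gradient direction.
        noise = rng.standard_normal(x.shape)
        noise /= np.linalg.norm(noise)
        x = x - step * (direction + noise_scale * noise)
    return x

# Start on the y-axis, from which the noiseless iteration heads straight to the saddle.
x_final = noisy_ngd(grad_f, [0.0, 1.0])
print(x_final)
```

Starting from (0, 1), the gradient has no x-component, so setting `noise_scale=0.0` keeps the iterate on the y-axis and it ends near the saddle at the origin. With noise, the x-coordinate is perturbed away from zero and the iterate settles in a small neighborhood of one of the two minima. Because the step length is fixed, the iterate hovers around the minimum rather than converging exactly.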

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:1611.04831 [cs.LG]
	(or arXiv:1611.04831v1 [cs.LG] for this version)

Submission history

From: Kfir Levy Yehuda
[v1] Tue, 15 Nov 2016 13:56:24 GMT (283kb,D)
