Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Iiduka, Hideaki

(Submitted on 22 Feb 2020 (v1), last revised 22 Nov 2020 (this version, v4))

Abstract: This paper deals with nonconvex stochastic optimization problems in deep learning and provides appropriate learning rates with which adaptive learning rate optimization algorithms, such as Adam and AMSGrad, can approximate a stationary point of the problem. In particular, constant and diminishing learning rates are provided to approximate a stationary point of the problem. Our results also guarantee that the adaptive learning rate optimization algorithms can approximate global minimizers of convex stochastic optimization problems. The adaptive learning rate optimization algorithms are examined in numerical experiments on text and image classification. The experiments show that the algorithms with constant learning rates perform better than ones with diminishing learning rates.
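The two learning-rate regimes named in the abstract can be illustrated with a minimal sketch. It assumes the standard Adam update with bias correction, a deterministic gradient on a one-dimensional toy problem, and a diminishing schedule of the form alpha / sqrt(t); none of these choices is taken from the paper, whose algorithms and step-size conditions may differ. The names `constant_lr`, `diminishing_lr`, `adam_minimize`, and `toy_grad` are hypothetical and exist only for this illustration.

```python
import math
from typing import Callable


def constant_lr(alpha: float) -> Callable[[int], float]:
    """Schedule that returns the same learning rate at every step."""
    return lambda t: alpha


def diminishing_lr(alpha: float) -> Callable[[int], float]:
    """Schedule alpha / sqrt(t) for t = 1, 2, ... (an assumed form)."""
    return lambda t: alpha / math.sqrt(t)


def adam_minimize(
    grad: Callable[[float], float],
    x0: float,
    lr_schedule: Callable[[int], float],
    steps: int,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
) -> float:
    """Run standard Adam on a scalar variable and return the final iterate."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        # Exponential moving averages of the gradient and its square.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction for the zero initialization of m and v.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # The only difference between the two regimes is lr_schedule(t).
        x -= lr_schedule(t) * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_grad(x: float) -> float:
    """Gradient of the toy objective f(x) = (x - 3)^2 (an assumption)."""
    return 2.0 * (x - 3.0)


if __name__ == "__main__":
    x_const = adam_minimize(toy_grad, 0.0, constant_lr(0.1), 2000)
    x_dimin = adam_minimize(toy_grad, 0.0, diminishing_lr(0.1), 2000)
    print("constant learning rate:   ", x_const)
    print("diminishing learning rate:", x_dimin)
```

Passing the schedule as a function of the step index keeps the update rule identical across both regimes, so any difference in behavior comes from the learning rate alone. This toy problem says nothing about the text and image classification experiments reported in the paper.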

Subjects:	Optimization and Control (math.OC)
MSC classes:	65K05, 90C25, 90C90, 92B20
Cite as:	arXiv:2002.09647 [math.OC]
	(or arXiv:2002.09647v4 [math.OC] for this version)

Submission history

From: Hideaki Iiduka
[v1] Sat, 22 Feb 2020 07:01:26 GMT (17kb)
[v2] Mon, 8 Jun 2020 09:35:50 GMT (18kb)
[v3] Sat, 4 Jul 2020 10:28:08 GMT (421kb)
[v4] Sun, 22 Nov 2020 15:39:47 GMT (833kb)
