Mathematics > Optimization and Control
Title: On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods
(Submitted on 22 Sep 2021 (v1), last revised 4 Jul 2023 (this version, v2))
Abstract: In this study, we demonstrate that the norm test and inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates associated with Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we demonstrate that the inner product/orthogonality test can be as inexpensive as the norm test in the best-case scenario if $\theta$ and $\nu$ are optimally selected, but the inner product/orthogonality test will never be more computationally affordable than the norm test if $\epsilon^2=\theta^2+\nu^2$. Finally, we present two stochastic optimization problems to illustrate our results.
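The relation $\epsilon^2=\theta^2+\nu^2$ can be illustrated with a minimal sketch. It assumes the usual sample-variance form of the three tests, applied to a batch of per-sample gradients. That form is an assumption made here for illustration and is not taken from the paper. The function name `batch_tests` and its arguments are hypothetical. The sketch splits each gradient deviation into a component along the mean gradient and a component orthogonal to it, so the total variance is the sum of the two directional variances.

```python
import numpy as np

def batch_tests(grads, eps, theta, nu):
    """Evaluate sample-based batch size tests (illustrative form only).

    grads: array of shape (b, d) holding b per-sample gradients.
    eps, theta, nu: tolerances for the norm, inner product, and
    orthogonality tests, respectively.
    """
    b = grads.shape[0]
    g = grads.mean(axis=0)            # mini-batch gradient estimate
    g_norm2 = float(g @ g)            # squared norm of the estimate
    unit = g / np.sqrt(g_norm2)       # unit vector along the estimate

    dev = grads - g                   # per-sample deviations from the mean
    along = dev @ unit                # deviation components along g
    orth = dev - np.outer(along, unit)  # deviation components orthogonal to g

    # Unbiased sample variances: total, along g, and orthogonal to g.
    var_total = float((dev ** 2).sum() / (b - 1))
    var_along = float((along ** 2).sum() / (b - 1))
    var_orth = float((orth ** 2).sum() / (b - 1))

    return {
        "norm": bool(var_total / b <= eps ** 2 * g_norm2),
        "inner_product": bool(var_along / b <= theta ** 2 * g_norm2),
        "orthogonality": bool(var_orth / b <= nu ** 2 * g_norm2),
        "variances": (var_total, var_along, var_orth),
    }

# Usage on synthetic gradients (made-up data, for illustration only).
rng = np.random.default_rng(0)
grads = rng.normal(loc=1.0, scale=0.5, size=(64, 5))
theta, nu = 0.3, 0.4
result = batch_tests(grads, eps=np.hypot(theta, nu), theta=theta, nu=nu)
print(result)
```

Because the total variance equals the sum of the along and orthogonal variances, a batch that passes both the inner product and orthogonality tests in this sketch also passes the norm test with $\epsilon^2=\theta^2+\nu^2$.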
Submission history
From: Luis Espath
[v1] Wed, 22 Sep 2021 18:01:15 GMT (159kb,D)
[v2] Tue, 4 Jul 2023 10:20:34 GMT (159kb,D)