Statistics > Machine Learning
Title: Guarantees for Tuning the Step Size using a Learning-to-Learn Approach
(Submitted on 30 Jun 2020 (v1), last revised 11 Jun 2021 (this version, v2))
Abstract: Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach -- using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates -- was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from the meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.
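The explosion/vanishing behaviour mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code: it assumes a diagonal quadratic f(w) = 0.5 * sum_i h_i * w_i^2, takes the naïve meta-objective to be the loss after T gradient-descent steps, and uses the closed-form derivative of that loss with respect to the step size. The function names, curvatures h, starting point w0, and horizon T are hypothetical values chosen only for illustration.

```python
# Illustrative sketch only (assumed setup, not the authors' implementation).
# Gradient descent on f(w) = 0.5 * sum_i h_i * w_i**2 updates each coordinate as
#   w_i <- (1 - eta * h_i) * w_i,
# so after T steps w_i = (1 - eta * h_i)**T * w0_i.

def naive_meta_objective(eta, h, w0, T):
    """Loss after T gradient-descent steps with step size eta."""
    return 0.5 * sum(hi * ((1.0 - eta * hi) ** T * wi) ** 2
                     for hi, wi in zip(h, w0))

def naive_meta_gradient(eta, h, w0, T):
    """Closed-form derivative of naive_meta_objective with respect to eta."""
    return -T * sum(hi ** 2 * (1.0 - eta * hi) ** (2 * T - 1) * wi ** 2
                    for hi, wi in zip(h, w0))

h = [1.0, 0.5]   # hypothetical curvatures (eigenvalues of the quadratic)
w0 = [1.0, 1.0]  # hypothetical starting point
T = 50           # hypothetical number of inner steps

# |1 - eta * h_i| < 1 for every i: the meta-gradient shrinks exponentially in T.
small = naive_meta_gradient(0.5, h, w0, T)
# |1 - eta * h_1| = 1.5 > 1: the meta-gradient grows exponentially in T.
large = naive_meta_gradient(2.5, h, w0, T)

print(f"eta=0.5: {small:.3e}   eta=2.5: {large:.3e}")
```

In this toy setting the magnitude of the meta-gradient scales like |1 - eta * h_i|^(2T - 1), which is one simple way to see why the naïve objective is hard to optimize with meta-gradient descent when T is large.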
Submission history
From: Xiang Wang
[v1] Tue, 30 Jun 2020 02:59:35 GMT (256kb)
[v2] Fri, 11 Jun 2021 04:21:42 GMT (1355kb)