
Title: Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

Abstract: Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms to accelerate convergence and improve the sample efficiency of deep RL. Despite these empirical gains, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy that introduces a stable regularization term in Anderson mixing and a differentiable, non-expansive MellowMax operator, which together allow both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as: arXiv:2110.08896 [cs.LG]
  (or arXiv:2110.08896v2 [cs.LG] for this version)
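
The abstract refers to two ingredients, Anderson mixing on a fixed-point iteration and the MellowMax operator. The sketch below is only an illustration of those two ideas, not the paper's algorithm: it uses the textbook memory-1 form of Anderson mixing, and the names `anderson1`, `beta` (damping), and `reg` (a simple Tikhonov-style term standing in for the paper's regularization, whose exact form is not given here) are assumptions. The tiny deterministic MDP is likewise hypothetical.

```python
import math

def mellowmax(values, omega=5.0):
    """MellowMax: log(mean(exp(omega * v))) / omega, shifted by max for stability."""
    m = max(values)
    s = sum(math.exp(omega * (v - m)) for v in values)
    return m + math.log(s / len(values)) / omega

def anderson1(g, x0, beta=1.0, reg=1e-10, tol=1e-10, max_iter=200):
    """Memory-1 Anderson mixing for x = g(x). Returns (x, iterations used).

    beta is the damping (mixing) parameter; reg is an assumed Tikhonov-style
    term that keeps the mixing coefficient bounded when residuals stagnate.
    """
    x_prev = list(x0)
    f_prev = [gi - xi for gi, xi in zip(g(x_prev), x_prev)]
    # First step: plain damped fixed-point update (no history yet).
    x = [xi + beta * fi for xi, fi in zip(x_prev, f_prev)]
    for k in range(1, max_iter + 1):
        f = [gi - xi for gi, xi in zip(g(x), x)]  # residual g(x) - x
        if max(abs(fi) for fi in f) < tol:
            return x, k
        df = [a - b for a, b in zip(f, f_prev)]
        dx = [a - b for a, b in zip(x, x_prev)]
        # Regularized least-squares coefficient minimizing |f - gamma * df|.
        gamma = sum(a * b for a, b in zip(f, df)) / (sum(d * d for d in df) + reg)
        x_new = [xi - gamma * dxi + beta * (fi - gamma * dfi)
                 for xi, dxi, fi, dfi in zip(x, dx, f, df)]
        x_prev, f_prev, x = x, f, x_new
    return x, max_iter

# Hypothetical 2-state, 2-action deterministic MDP (illustrative numbers only).
REWARD = [[1.0, 0.0], [0.0, 2.0]]   # REWARD[s][a]
NEXT = [[0, 1], [0, 1]]             # NEXT[s][a] is the successor state
DISCOUNT = 0.9

def bellman_mellowmax(v):
    """Bellman backup with max over actions replaced by MellowMax."""
    return [mellowmax([REWARD[s][a] + DISCOUNT * v[NEXT[s][a]] for a in (0, 1)])
            for s in (0, 1)]

if __name__ == "__main__":
    values, iters = anderson1(bellman_mellowmax, [0.0, 0.0])
    print("approximate fixed point:", values, "after", iters, "iterations")
```

For an affine contraction with `beta=1`, each accelerated step of this memory-1 scheme shrinks the residual norm at least as fast as the plain iteration does; for nonlinear operators such as the MellowMax backup above, that guarantee does not carry over automatically, which is the kind of gap the paper's analysis addresses.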

Submission history

From: Yafei Wang
[v1] Sun, 17 Oct 2021 19:07:25 GMT (295kb,D)
[v2] Wed, 20 Oct 2021 05:12:46 GMT (737kb,D)
