
Title: Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients

Abstract: Kakade's natural policy gradient method has been studied extensively in recent years, with linear convergence shown both with and without regularization. We study another natural gradient method, which is based on the Fisher information matrix of the state-action distributions and has received little attention from the theoretical side. Here, the state-action distributions follow the Fisher-Rao gradient flow inside the state-action polytope with respect to a linear potential. Therefore, we study Fisher-Rao gradient flows of linear programs more generally and show linear convergence with a rate that depends on the geometry of the linear program. Equivalently, this yields an estimate on the error induced by entropic regularization of the linear program, which improves existing results. We extend these results and show sublinear convergence for perturbed Fisher-Rao gradient flows and natural gradient flows up to an approximation error. In particular, these general results cover the case of state-action natural policy gradients.
Comments: 27 pages, 4 figures, under review
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Systems and Control (eess.SY); Numerical Analysis (math.NA); Machine Learning (stat.ML)
MSC classes: 65K05, 90C05, 90C08, 90C40, 90C53
Cite as: arXiv:2403.19448 [math.OC]
  (or arXiv:2403.19448v1 [math.OC] for this version)
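
The Fisher-Rao gradient flow mentioned in the abstract can be illustrated in a simple special case. The sketch below assumes the polytope is the full probability simplex and the objective is to maximize a linear potential <c, p>; this setting, the function names, and the numbers are illustrative assumptions, not taken from the paper. On the simplex the flow reduces to the replicator equation dp/dt = p * (c - <c, p>), whose solution p(t) ∝ p(0) * exp(t c) also maximizes <c, p> - (1/t) KL(p || p(0)), which is one way to see the link to entropic regularization.

```python
import numpy as np

def fisher_rao_flow_simplex(c, p0, t):
    """Closed-form flow on the simplex: p(t) is proportional to p0 * exp(t * c).

    Assumes p0 is strictly positive (illustrative special case only).
    """
    c = np.asarray(c, dtype=float)
    logits = np.log(np.asarray(p0, dtype=float)) + t * c
    logits -= logits.max()          # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def euler_replicator(c, p0, t_end, dt=1e-3):
    """Explicit Euler integration of dp/dt = p * (c - <c, p>) as a cross-check."""
    c = np.asarray(c, dtype=float)
    p = np.asarray(p0, dtype=float).copy()
    for _ in range(int(round(t_end / dt))):
        p = p + dt * p * (c - c @ p)
    return p

# Hypothetical three-vertex linear program: maximize <c, p> over the simplex.
c = np.array([1.0, 0.5, 0.0])
p0 = np.full(3, 1.0 / 3.0)
times = [0.0, 5.0, 10.0, 20.0]

# Suboptimality gap max(c) - <c, p(t)> along the flow.
gaps = [c.max() - c @ fisher_rao_flow_simplex(c, p0, t) for t in times]
for t, g in zip(times, gaps):
    print(f"t = {t:5.1f}   gap = {g:.3e}")
```

In this toy example the gap decays exponentially in t, at a rate set by the difference between the best and second-best entries of c (0.5 here). This is a simplex-only analogue of the geometry-dependent rate the abstract refers to, not a reproduction of the paper's result.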

Submission history

From: Johannes Müller
[v1] Thu, 28 Mar 2024 14:16:23 GMT (2410kb,D)
