Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients

Müller, Johannes; Çaycı, Semih; Montúfar, Guido

(Submitted on 28 Mar 2024)

Abstract: Kakade's natural policy gradient method has been studied extensively in recent years, with results showing linear convergence both with and without regularization. We study another natural gradient method, which is based on the Fisher information matrix of the state-action distributions and has received little attention from the theoretical side. Under this method, the state-action distributions follow the Fisher-Rao gradient flow inside the state-action polytope with respect to a linear potential. We therefore study Fisher-Rao gradient flows of linear programs more generally and show linear convergence with a rate that depends on the geometry of the linear program. Equivalently, this yields an estimate of the error induced by entropic regularization of the linear program, which improves on existing results. We extend these results and show sublinear convergence for perturbed Fisher-Rao gradient flows and natural gradient flows up to an approximation error. In particular, these general results cover the case of state-action natural policy gradients.
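
As a rough illustration of the linear convergence mentioned in the abstract, the sketch below treats only the simplest special case, in which the feasible polytope is the whole probability simplex. It is not taken from the paper. In that case the Fisher-Rao gradient flow of a linear objective is the replicator dynamics, whose solution is p_i(t) proportional to p_i(0) exp(t c_i). The function names, the cost vector `c`, and the initial point `p0` are hypothetical choices made for this example.

```python
import math

def fisher_rao_flow_simplex(p0, c, t):
    """State at time t >= 0 of the Fisher-Rao gradient flow maximizing sum(c_i * p_i)
    over the probability simplex (assumed special case, closed form)."""
    m = max(c)  # shift exponents by max(c) so exp never overflows for t >= 0
    weights = [p * math.exp(t * (ci - m)) for p, ci in zip(p0, c)]
    z = sum(weights)
    return [w / z for w in weights]

def suboptimality(p, c):
    """Gap between the optimal value max(c) and the current objective value."""
    return max(c) - sum(pi * ci for pi, ci in zip(p, c))

# Hypothetical three-vertex problem; the best and second-best vertices differ by 0.5.
c = [1.0, 0.5, 0.0]
p0 = [1 / 3, 1 / 3, 1 / 3]
times = (0.0, 10.0, 20.0, 30.0)
gaps = [suboptimality(fisher_rao_flow_simplex(p0, c, t), c) for t in times]

for t, g in zip(times, gaps):
    print(f"t = {t:5.1f}   suboptimality = {g:.3e}")
```

In this toy setting the suboptimality shrinks by a factor of roughly exp(-0.5 * 10) per ten time units for large t, so the rate is set by the gap between the best and second-best vertex. The paper treats general linear programs, where the rate depends on the geometry of the polytope.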

Comments:	27 pages, 4 figures, under review
Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG); Systems and Control (eess.SY); Numerical Analysis (math.NA); Machine Learning (stat.ML)
MSC classes:	65K05, 90C05, 90C08, 90C40, 90C53
Cite as:	arXiv:2403.19448 [math.OC]
	(or arXiv:2403.19448v1 [math.OC] for this version)

Submission history

From: Johannes Müller [view email]
[v1] Thu, 28 Mar 2024 14:16:23 GMT (2410kb,D)
