Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

Barakat, Anas; Bianchi, Pascal; Lehmann, Julien

(Submitted on 14 Jun 2021 (v1), last revised 22 Feb 2022 (this version, v2))

Abstract: Actor-critic methods that integrate target networks have achieved remarkable empirical success in deep reinforcement learning. However, a theoretical understanding of the use of target networks in actor-critic methods is largely missing from the literature. In this paper, we narrow this gap between theory and practice by proposing the first theoretical analysis of an online target-based actor-critic algorithm with linear function approximation in the discounted reward setting. Our algorithm uses three different timescales: one for the actor and two for the critic. Instead of using the standard single-timescale temporal difference (TD) learning algorithm as a critic, we use a two-timescale target-based version of TD learning closely inspired by practical actor-critic algorithms that implement target networks. First, we establish asymptotic convergence results for both the critic and the actor under Markovian sampling. Then, we provide a finite-time analysis showing the impact of incorporating a target network into actor-critic methods.
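
To make the three-timescale structure described in the abstract concrete, the following is a minimal sketch of one online update. It is not the paper's exact algorithm. It assumes a state-value critic with linear features, a tabular softmax actor, a TD error that bootstraps from the target weights, and the TD error used as the advantage estimate in the actor step. The names `actor_critic_step`, `softmax_policy`, `alpha`, `beta`, and `xi`, as well as the step-size values and their ordering, are illustrative assumptions.

```python
import numpy as np

def softmax_policy(theta, s):
    """Action probabilities in state s for a tabular softmax policy (assumed form)."""
    logits = theta[s] - np.max(theta[s])  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def actor_critic_step(theta, w, w_target, phi, s, a, r, s_next,
                      gamma=0.9, alpha=0.001, beta=0.1, xi=0.01):
    """One online update; returns new (theta, w, w_target) without mutating inputs.

    alpha, beta, xi are hypothetical step sizes for the actor, the online
    critic weights, and the target critic weights, respectively.
    """
    # TD error bootstraps from the target weights rather than the online weights.
    delta = r + gamma * (phi[s_next] @ w_target) - phi[s] @ w
    # Critic timescale 1: online weights follow the TD error.
    w_new = w + beta * delta * phi[s]
    # Critic timescale 2: target weights track the online weights.
    w_target_new = w_target + xi * (w_new - w_target)
    # Actor timescale: gradient of log softmax is one_hot(a) - pi(.|s).
    grad_log = -softmax_policy(theta, s)
    grad_log[a] += 1.0
    theta_new = theta.copy()
    theta_new[s] += alpha * delta * grad_log
    return theta_new, w_new, w_target_new
```

In this sketch the actor step size is the smallest, so the policy changes slowly relative to both critic variables; whether the target weights or the online weights should move faster is a choice the sketch makes arbitrarily, and the paper should be consulted for the step-size conditions it actually analyzes.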

Comments:	50 pages
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Journal reference:	AISTATS 2022
Cite as:	arXiv:2106.07472 [cs.LG]
	(or arXiv:2106.07472v2 [cs.LG] for this version)

Submission history

From: Anas Barakat
[v1] Mon, 14 Jun 2021 14:59:05 GMT (45kb)
[v2] Tue, 22 Feb 2022 19:40:36 GMT (69kb)
