Cooperative Actor-Critic via TD Error Aggregation

Figura, Martin; Lin, Yixuan; Liu, Ji; Gupta, Vijay

(Submitted on 25 Jul 2022)

Abstract: In decentralized cooperative multi-agent reinforcement learning, agents can aggregate information from one another to learn policies that maximize a team-average objective function. Although willing to cooperate, individual agents may find it undesirable to directly share their local state, reward, and value function because of privacy concerns. In this work, we introduce a decentralized actor-critic algorithm with TD error aggregation that respects these privacy concerns and assumes that communication channels are subject to time delays and packet dropouts. The cost we pay for making such weak assumptions is an increased communication burden for every agent, as measured by the dimension of the transmitted data. Interestingly, the communication burden is only quadratic in the graph size, which renders the algorithm applicable in large networks. We provide a convergence analysis under diminishing step size to verify that the agents maximize the team-average objective function.
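
To make the idea of TD error aggregation concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes the standard one-step TD error and a plain arithmetic mean computed in one place; the function names, the discount factor, and the numbers are hypothetical. It ignores the time delays, packet dropouts, and decentralized communication that the paper addresses. It only illustrates that a team-average signal can be formed from scalar TD errors without exchanging local states, rewards, or value functions.

```python
from typing import List


def local_td_error(reward: float, value_now: float, value_next: float,
                   gamma: float) -> float:
    """Standard one-step TD error: r + gamma * V(s') - V(s).

    Computed privately by each agent from its own reward and value estimates.
    """
    return reward + gamma * value_next - value_now


def team_average(td_errors: List[float]) -> float:
    """Arithmetic mean of the agents' TD errors (assumed aggregation rule)."""
    return sum(td_errors) / len(td_errors)


# Hypothetical data for three agents: (reward, V(s), V(s')).
agents = [(1.0, 0.5, 1.0), (0.0, 0.5, 1.0), (2.0, 0.5, 1.0)]
gamma = 0.5  # hypothetical discount factor

# Each agent shares only its scalar TD error, not its reward or values.
td_errors = [local_td_error(r, v, v_next, gamma) for r, v, v_next in agents]
avg_td_error = team_average(td_errors)
print(td_errors, avg_td_error)
```

In this toy setting every agent would use the same averaged scalar to drive its own update, while its reward and value estimates never leave the agent.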

Subjects:	Systems and Control (eess.SY); Machine Learning (cs.LG)
Cite as:	arXiv:2207.12533 [eess.SY]
	(or arXiv:2207.12533v1 [eess.SY] for this version)

Submission history

From: Martin Figura
[v1] Mon, 25 Jul 2022 21:10:39 GMT (404kb,D)
