Distributed TD(0) with Almost No Communication

Liu, Rui; Olshevsky, Alex

(Submitted on 16 Apr 2021 (v1), last revised 27 Jan 2022 (this version, v2))

Abstract: We provide a new non-asymptotic analysis of distributed TD(0) with linear function approximation. Our approach relies on "one-shot averaging," where $N$ agents run local copies of TD(0) and average the outcomes only once at the very end. We consider two models: one in which the agents interact with an environment they can observe and whose transitions depend on all of their actions (which we call the global state model), and one in which each agent can run a local copy of an identical Markov Decision Process (which we call the local state model).
In the global state model, we show that the convergence rate of our distributed one-shot averaging method matches the known convergence rate of TD(0). By contrast, the best convergence rate in the previous literature could, according to the worst-case bounds given, underperform the non-distributed version by a factor of $O(N^3)$ in the number of agents $N$. In the local state model, we demonstrate a version of the linear time speedup phenomenon, where the convergence time of the distributed process is a factor of $N$ faster than the convergence time of TD(0). As far as we are aware, this is the first result rigorously showing benefits from parallelism for temporal difference methods.
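
The one-shot averaging idea for the local state model can be illustrated with a minimal sketch: each agent runs its own TD(0) iteration with linear function approximation on an independent copy of the same Markov reward process, and the parameter vectors are averaged a single time at the end. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-state toy chain `P`, the rewards `R`, the features `PHI`, the constant step size, the use of the final iterate, and the function names `td0_linear` and `one_shot_average`.

```python
import random

# Hypothetical toy Markov reward process (not from the paper).
P = [[0.5, 0.5, 0.0],   # P[s][s2] = probability of moving from s to s2
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
R = [0.0, 1.0, 0.5]     # reward received when leaving state s
PHI = [[1.0, 0.0],      # feature vector phi(s) for each state
       [0.5, 0.5],
       [0.0, 1.0]]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def td0_linear(P, R, PHI, gamma, steps, step_size, rng):
    """One agent: TD(0) with linear function approximation on a sampled path."""
    theta = [0.0] * len(PHI[0])
    s = rng.randrange(len(P))
    for _ in range(steps):
        s_next = rng.choices(range(len(P)), weights=P[s])[0]
        # Temporal-difference error for the observed transition.
        delta = R[s] + gamma * dot(theta, PHI[s_next]) - dot(theta, PHI[s])
        # Semi-gradient update along the features of the current state.
        theta = [t + step_size * delta * f for t, f in zip(theta, PHI[s])]
        s = s_next
    return theta


def one_shot_average(P, R, PHI, gamma, n_agents, steps, step_size, seed=0):
    """Run n_agents independent TD(0) copies, then average their results once."""
    thetas = [
        td0_linear(P, R, PHI, gamma, steps, step_size, random.Random(seed + i))
        for i in range(n_agents)
    ]
    dim = len(PHI[0])
    return [sum(th[k] for th in thetas) / n_agents for k in range(dim)]


if __name__ == "__main__":
    print(one_shot_average(P, R, PHI, gamma=0.9, n_agents=8,
                           steps=2000, step_size=0.05))
```

No communication happens inside the loop in this sketch; the only exchange is the final average in `one_shot_average`, which is the sense in which the method uses almost no communication.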

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2104.07855 [cs.LG]
	(or arXiv:2104.07855v2 [cs.LG] for this version)

Submission history

From: Rui Liu
[v1] Fri, 16 Apr 2021 02:21:11 GMT (36kb)
[v2] Thu, 27 Jan 2022 21:56:06 GMT (2306kb,D)
