Efficient Online Reinforcement Learning with Offline Data

Ball, Philip J.; Smith, Laura; Kostrikov, Ilya; Levine, Sergey

(Submitted on 6 Feb 2023 (v1), last revised 31 May 2023 (this version, v4))

Abstract: Sample efficiency and exploration remain major challenges in online reinforcement learning (RL). A powerful approach to addressing these issues is the inclusion of offline data, such as prior trajectories from a human expert or a sub-optimal exploration policy. Previous methods have relied on extensive modifications and additional complexity to ensure the effective use of this data. Instead, we ask: can we simply apply existing off-policy methods to leverage offline data when learning online? In this work, we demonstrate that the answer is yes; however, a set of minimal but important changes to existing off-policy RL algorithms is required to achieve reliable performance. We extensively ablate these design choices, demonstrating the key factors that most affect performance, and arrive at a set of recommendations that practitioners can readily apply, whether their data comprise a small number of expert demonstrations or large volumes of sub-optimal trajectories. We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches across a diverse set of competitive benchmarks, with no additional computational overhead. We have released our code at this https URL

Comments:	Short Presentation at ICML 2023; to reproduce our results and use our codebase, see this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2302.02948 [cs.LG]
	(or arXiv:2302.02948v4 [cs.LG] for this version)

Submission history

From: Philip Ball
[v1] Mon, 6 Feb 2023 17:30:22 GMT (5744kb,D)
[v2] Wed, 15 Feb 2023 13:06:10 GMT (5744kb,D)
[v3] Thu, 9 Mar 2023 18:59:27 GMT (5745kb,D)
[v4] Wed, 31 May 2023 10:52:56 GMT (3363kb,D)
