Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains

Bannon, James; Windsor, Brad; Song, Wenbo; Li, Tao

(Submitted on 3 Jun 2020)

Abstract: Reinforcement learning algorithms have had tremendous successes in online learning settings. However, these successes have relied on low-stakes interactions between the algorithmic agent and its environment. In many settings where RL could be of use, such as health care and autonomous driving, the mistakes made by most online RL algorithms during early training come with unacceptable costs. These settings require developing reinforcement learning algorithms that can operate in the so-called batch setting, where the algorithms must learn from a set of data that is fixed, finite, and generated from some (possibly unknown) policy. Evaluating policies different from the one that collected the data is called off-policy evaluation, and naturally poses counterfactual questions. In this project we show how off-policy evaluation and the estimation of treatment effects in causal inference are two approaches to the same problem, and compare recent progress in these two areas.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2006.02579 [cs.LG]
	(or arXiv:2006.02579v1 [cs.LG] for this version)

Submission history

From: Tao Li
[v1] Wed, 3 Jun 2020 23:14:14 GMT (215kb,D)
