Constrained episodic reinforcement learning in concave-convex and knapsack settings

Brantley, Kianté; Dudik, Miroslav; Lykouris, Thodoris; Miryoosefi, Sobhan; Simchowitz, Max; Slivkins, Aleksandrs; Sun, Wen

(Submitted on 9 Jun 2020 (v1), last revised 6 Jun 2021 (this version, v2))

Abstract: We propose an algorithm for tabular episodic reinforcement learning with constraints. We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints, and for settings with hard constraints (knapsacks). Most of the previous work in constrained reinforcement learning is limited to linear constraints, and the remaining work focuses on either the feasibility question or settings with a single episode. Our experiments demonstrate that the proposed algorithm significantly outperforms these approaches in existing constrained episodic environments.

Comments:	The NeurIPS 2020 version of this paper includes a small bug, leading to an incorrect dependence on H in Theorem 3.4. This version fixes it by adjusting Eq. (9), Theorem 3.4 and the relevant proofs. Changes in the main text are noted in red. Changes in the appendix are limited to Appendices B.1, B.5, and B.6 and the statement of Lemma F.3.
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
Cite as:	arXiv:2006.05051 [cs.LG]
	(or arXiv:2006.05051v2 [cs.LG] for this version)

Submission history

From: Thodoris Lykouris
[v1] Tue, 9 Jun 2020 05:02:44 GMT (1302kb,D)
[v2] Sun, 6 Jun 2021 03:30:29 GMT (2697kb,D)
