Constrained Upper Confidence Reinforcement Learning

Zheng, Liyuan; Ratliff, Lillian J.

Authors: Liyuan Zheng, Lillian J. Ratliff

(Submitted on 26 Jan 2020)

Abstract: Constrained Markov Decision Processes are a class of stochastic decision problems in which the decision maker must select a policy that satisfies auxiliary cost constraints. This paper extends upper confidence reinforcement learning to settings in which the reward function and the constraints, described by cost functions, are unknown a priori but the transition kernel is known. Such a setting is well motivated by a number of applications, including exploration of unknown, potentially unsafe environments. We present an algorithm, C-UCRL, and show that, with probability $1-\delta$, it achieves sub-linear regret, $O(T^{\frac{3}{4}}\sqrt{\log(T/\delta)})$, with respect to the reward while satisfying the constraints even while learning. Illustrative examples are provided.
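
To make the "sub-linear" claim concrete, the short sketch below evaluates the stated rate $T^{3/4}\sqrt{\log(T/\delta)}$ and divides it by $T$. It is only an illustration of the bound's growth, not the C-UCRL algorithm: the function names are hypothetical, and the constant hidden by the $O(\cdot)$ notation is assumed to be 1 because the abstract does not give it.

```python
import math


def regret_rate(T: int, delta: float) -> float:
    """Rate from the abstract, T^(3/4) * sqrt(log(T / delta)).

    Assumption: the constant hidden by the O(.) notation is taken to be 1.
    """
    return T ** 0.75 * math.sqrt(math.log(T / delta))


def per_step_rate(T: int, delta: float) -> float:
    """Rate divided by the horizon T, i.e. T^(-1/4) * sqrt(log(T / delta))."""
    return regret_rate(T, delta) / T


if __name__ == "__main__":
    delta = 0.05  # illustrative confidence parameter, not taken from the paper
    for T in (10**2, 10**3, 10**4, 10**5, 10**6):
        print(f"T={T:>8}  rate={regret_rate(T, delta):12.1f}  "
              f"rate/T={per_step_rate(T, delta):.4f}")
```

The total rate grows with the horizon, while the per-step value shrinks as $T$ increases, which is what sub-linear regret means: the average reward lost per step relative to the best constraint-satisfying policy vanishes over time.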

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2001.09377 [cs.LG]
	(or arXiv:2001.09377v1 [cs.LG] for this version)

Submission history

From: Liyuan Zheng
[v1] Sun, 26 Jan 2020 00:23:02 GMT (1639kb,D)
