Near-optimal Optimistic Reinforcement Learning using Empirical Bernstein Inequalities

Tossou, Aristide; Basu, Debabrota; Dimitrakakis, Christos

(Submitted on 27 May 2019 (v1), last revised 11 Dec 2019 (this version, v2))

Abstract: We study model-based reinforcement learning in an unknown finite communicating Markov decision process. We propose a simple algorithm that leverages a variance-based confidence interval. We show that the proposed algorithm, UCRL-V, achieves the optimal regret $\tilde{\mathcal{O}}(\sqrt{DSAT})$ up to logarithmic factors, and so our work closes a gap with the lower bound without additional assumptions on the MDP. We perform experiments in a variety of environments that validate the theoretical bounds and show UCRL-V to be better than the state-of-the-art algorithms.
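
As a rough illustration of what a variance-based confidence interval looks like, the sketch below computes a generic one-sided empirical Bernstein radius for i.i.d. samples in [0, 1], in the form given by Maurer and Pontil (2009), and compares it with a Hoeffding radius. This is an assumption-laden example, not the interval used by UCRL-V: the function names, the constants, and the choice of `delta` are illustrative and do not come from the paper.

```python
import math
from statistics import variance


def empirical_bernstein_radius(samples, delta):
    """One-sided empirical Bernstein radius for i.i.d. samples in [0, 1].

    With probability at least 1 - delta, the true mean exceeds the sample
    mean by no more than the returned value. Generic textbook form, not
    the specific bound used by UCRL-V.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate the variance")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    log_term = math.log(2.0 / delta)
    sample_var = variance(samples)  # unbiased estimate (divides by n - 1)
    variance_part = math.sqrt(2.0 * sample_var * log_term / n)
    range_part = 7.0 * log_term / (3.0 * (n - 1))
    return variance_part + range_part


def hoeffding_radius(n, delta):
    """One-sided Hoeffding radius for n i.i.d. samples in [0, 1]."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    delta = 0.05
    low_variance = [0.5] * 1000        # every observation identical
    high_variance = [0.0, 1.0] * 500   # observations at the two extremes
    print("Bernstein, low variance: ", empirical_bernstein_radius(low_variance, delta))
    print("Bernstein, high variance:", empirical_bernstein_radius(high_variance, delta))
    print("Hoeffding (n = 1000):    ", hoeffding_radius(1000, delta))
```

When the observed variance is small, the variance term vanishes and the radius shrinks at a rate of roughly 1/n instead of 1/sqrt(n). When the variance is close to its maximum, the Hoeffding radius can be the tighter of the two.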

Comments:	The algorithm has been simplified (no need to look at lower bounds of the reward and transitions). The proof has been significantly cleaned up. The previous "assumption" is clarified as a condition of the algorithm, well known as submodularity. The proof that the bounds satisfy submodularity has been cleaned up.
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Machine Learning (stat.ML)
Cite as:	arXiv:1905.12425 [cs.LG]
	(or arXiv:1905.12425v2 [cs.LG] for this version)

Submission history

From: Aristide Charles Yedia Tossou [view email]
[v1] Mon, 27 May 2019 20:15:54 GMT (3403kb,D)
[v2] Wed, 11 Dec 2019 18:01:23 GMT (6651kb,D)
