Improved Analysis of UCRL2 with Empirical Bernstein Inequality

Fruit, Ronan; Pirotta, Matteo; Lazaric, Alessandro

(Submitted on 10 Jul 2020)

Abstract: We consider the problem of exploration-exploitation in communicating Markov Decision Processes. We provide an analysis of UCRL2 with Empirical Bernstein inequalities (UCRL2B). For any MDP with $S$ states, $A$ actions, $\Gamma \leq S$ next states and diameter $D$, the regret of UCRL2B is bounded as $\widetilde{O}(\sqrt{D\Gamma S A T})$.
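The abstract names empirical Bernstein inequalities but does not state the confidence intervals themselves. As a rough illustration only, the sketch below computes the width of a generic empirical Bernstein deviation bound for i.i.d. samples in [0, 1], in the form commonly attributed to Maurer and Pontil. The function name `empirical_bernstein_width` and the constants used are assumptions for illustration and are not taken from the paper; the intervals used in UCRL2B may differ in constants and logarithmic terms.

```python
import math
from typing import Sequence


def empirical_bernstein_width(samples: Sequence[float], delta: float) -> float:
    """Width of a generic empirical Bernstein bound (hypothetical helper).

    Assumes i.i.d. samples in [0, 1]. With probability at least 1 - delta,
    the true mean exceeds the sample mean by at most the returned width.
    The constants follow the commonly cited Maurer-Pontil form, not
    necessarily the ones used in the paper.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples for a sample variance")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")

    mean = sum(samples) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    log_term = math.log(2.0 / delta)

    # Variance-dependent term shrinks as 1/sqrt(n); the correction term as 1/n.
    return math.sqrt(2.0 * variance * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))


if __name__ == "__main__":
    low_variance = [0.5] * 100
    high_variance = [0.0, 1.0] * 50
    print(empirical_bernstein_width(low_variance, delta=0.05))
    print(empirical_bernstein_width(high_variance, delta=0.05))
```

The point of a bound of this shape is that the interval narrows when the observed variance is small, unlike a Hoeffding-style interval that depends only on the range and the sample count.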

Comments:	Document in support of the tutorial at ALT 2019
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2007.05456 [cs.LG]
	(or arXiv:2007.05456v1 [cs.LG] for this version)

Submission history

From: Matteo Pirotta
[v1] Fri, 10 Jul 2020 15:52:21 GMT (19kb)
