Computer Science > Machine Learning
Title: Regret Analysis of the Anytime Optimally Confident UCB Algorithm
(Submitted on 29 Mar 2016 (v1), last revised 6 May 2016 (this version, v2))
Abstract: I introduce and analyse an anytime version of the Optimally Confident UCB (OCUCB) algorithm designed for minimising the cumulative regret in finite-armed stochastic bandits with subgaussian noise. The new algorithm is simple, intuitive (in hindsight) and comes with the strongest finite-time regret guarantees for a horizon-free algorithm so far. I also show a finite-time lower bound that nearly matches the upper bound.
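To make the setting in the abstract concrete, here is a minimal sketch of a horizon-free index policy on a finite-armed bandit with unit-variance Gaussian (hence 1-subgaussian) noise, tracking cumulative pseudo-regret. It is not the OCUCB algorithm: the abstract does not state the OCUCB index, so the sketch assumes the textbook bonus sqrt(2 log t / n). The function name `run_ucb`, its parameters, and the arm means in the usage note are illustrative assumptions.

```python
import math
import random


def run_ucb(means, horizon, seed=0):
    """Run a plain anytime UCB policy; return (cumulative pseudo-regret, pull counts).

    Assumption: this uses the standard sqrt(2 log t / n) bonus, NOT the
    OCUCB index from the paper, which the abstract does not specify.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # sum of observed rewards per arm
    best = max(means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # The index depends only on the current round t, never on the
            # horizon, which is what makes the policy anytime (horizon-free).
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Reward with unit-variance Gaussian noise around the arm's mean.
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best mean and the chosen arm's mean.
        regret += best - means[arm]

    return regret, counts
```

Calling `run_ucb([0.0, 0.5, 1.0], 2000, seed=1)` returns the pseudo-regret accumulated over 2000 rounds together with the per-arm pull counts. Because the policy never reads `horizon` when choosing an arm, the same index can be run for any number of rounds.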
Submission history
From: Tor Lattimore
[v1] Tue, 29 Mar 2016 07:12:14 GMT (34kb)
[v2] Fri, 6 May 2016 19:06:26 GMT (21kb)