Mathematics > Statistics Theory
Title: Batched bandit problems
(Submitted on 2 May 2015 (v1), last revised 29 Mar 2016 (this version, v3))
Abstract: Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Submission history
From: Vianney Perchet
[v1] Sat, 2 May 2015 20:22:00 GMT (79kb)
[v2] Wed, 8 Jul 2015 13:55:27 GMT (79kb)
[v3] Tue, 29 Mar 2016 08:00:15 GMT (211kb)