Thresholding Bandit with Optimal Aggregate Regret

Tao, Chao; Blanco, Saùl; Peng, Jian; Zhou, Yuan

(Submitted on 27 May 2019)

Abstract: We consider the thresholding bandit problem, whose goal is to identify the arms whose mean rewards are above a given threshold $\theta$, using a fixed budget of $T$ trials. We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret (that is, the expected number of misclassified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating the algorithm's superior performance over existing algorithms in a variety of scenarios.
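
To make the problem setup concrete, the following is a minimal sketch of a thresholding bandit instance and a Monte Carlo estimate of the aggregate regret. It is not LSA: the abstract does not describe that algorithm, so the sketch uses a plain round-robin (uniform sampling) baseline. The Bernoulli reward model, the function names, and the convention that an arm with mean exactly $\theta$ counts as "above" are all assumptions made for illustration.

```python
import random

def uniform_sampling_classify(means, theta, budget, rng):
    """Pull arms round-robin for `budget` trials, then classify each arm
    by comparing its empirical mean with the threshold `theta`."""
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(budget):
        arm = t % k  # round-robin allocation (a baseline, NOT the paper's LSA)
        # Assumed reward model: Bernoulli with the arm's true mean.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    # Arms that were never pulled are classified as below the threshold.
    return [pulls[i] > 0 and totals[i] / pulls[i] >= theta for i in range(k)]

def aggregate_regret(means, theta, budget, runs=2000, seed=0):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    rng = random.Random(seed)
    truth = [m >= theta for m in means]  # ground-truth labels
    errors = 0
    for _ in range(runs):
        guess = uniform_sampling_classify(means, theta, budget, rng)
        errors += sum(g != t for g, t in zip(guess, truth))
    return errors / runs
```

Calling `aggregate_regret([0.1, 0.9], theta=0.5, budget=200)` estimates how many of the two arms the baseline misclassifies on average with $T = 200$ trials; increasing `budget` should drive the estimate toward zero on an instance this well separated.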

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1905.11046 [cs.LG]
	(or arXiv:1905.11046v1 [cs.LG] for this version)

Submission history

From: Yuan Zhou
[v1] Mon, 27 May 2019 08:51:26 GMT (655kb,D)
