Title: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

Abstract: In the fixed-budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is above or below a given threshold. We introduce a large family of algorithms (containing most existing relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough yet generic analysis of their performance. This allows us to construct new explicit algorithms, for a broad class of problems, whose losses are within a small constant factor of those of the non-adaptive oracle. Interestingly, we observe that adaptive methods empirically greatly outperform non-adaptive oracles, an uncommon behavior in standard online learning settings such as regret minimization. We explain this surprising phenomenon on an insightful toy problem.
Comments: 10+15 pages. To be published in the proceedings of NeurIPS 2021
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2110.09133 [cs.LG]
  (or arXiv:2110.09133v1 [cs.LG] for this version)

Submission history

From: Reda Ouhamma
[v1] Mon, 18 Oct 2021 09:36:36 GMT (273kb,D)
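
To make the fixed-budget setting described in the abstract concrete, here is a minimal Python sketch of a non-adaptive baseline: it splits the sample budget evenly across the distributions in round-robin order, then predicts for each one whether its mean lies above or below the threshold. This is not the Frank-Wolfe-inspired family introduced in the paper, nor the non-adaptive oracle it compares against. The function names `uniform_thresholding` and `count_errors`, the unit-variance Gaussian noise, and the tie-breaking rule at the threshold are all assumptions made for illustration.

```python
import random


def uniform_thresholding(means, threshold, budget, seed=0):
    """Hypothetical non-adaptive baseline for the thresholding bandit setting.

    Samples each distribution in round-robin order until the budget is spent,
    then predicts +1 if the empirical mean is at or above the threshold and
    -1 otherwise. Unit-variance Gaussian rewards are an assumption.
    """
    k = len(means)
    if budget < k:
        raise ValueError("budget must allow at least one sample per distribution")
    rng = random.Random(seed)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(budget):
        arm = t % k  # fixed allocation: never looks at past observations
        sums[arm] += rng.gauss(means[arm], 1.0)
        counts[arm] += 1
    estimates = [s / c for s, c in zip(sums, counts)]
    predictions = [1 if e >= threshold else -1 for e in estimates]
    return predictions, counts


def count_errors(means, threshold, predictions):
    """Number of distributions whose predicted sign differs from the true one."""
    truth = [1 if m >= threshold else -1 for m in means]
    return sum(p != s for p, s in zip(predictions, truth))


if __name__ == "__main__":
    true_means = [-0.5, 0.1, 0.4, 1.0]
    preds, pulls = uniform_thresholding(true_means, threshold=0.0, budget=400)
    print("samples per distribution:", pulls)
    print("errors:", count_errors(true_means, 0.0, preds))
```

An adaptive method would replace the round-robin line with a rule that chooses the next distribution from the observations collected so far, typically spending more samples on means that look close to the threshold.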
