Problem Dependent View on Structured Thresholding Bandit Problems

Cheshire, James; Ménard, Pierre; Carpentier, Alexandra

(Submitted on 18 Jun 2021)

Abstract: We investigate the problem dependent regime in the stochastic Thresholding Bandit problem (TBP) under several shape constraints. In the TBP, the objective of the learner is to output, at the end of a sequential game, the set of arms whose means are above a given threshold. The vanilla, unstructured case is already well studied in the literature. Taking $K$ as the number of arms, we consider (i) the case where the sequence of arms' means $(\mu_k)_{k=1}^K$ is monotonically increasing (MTBP) and (ii) the case where $(\mu_k)_{k=1}^K$ is concave (CTBP). We consider both cases in the problem dependent regime and study the probability of error, i.e., the probability of misclassifying at least one arm. In the fixed budget setting, we provide upper and lower bounds for the probability of error in both the concave and monotone settings, as well as associated algorithms. In both settings the bounds match in the problem dependent regime up to universal constants in the exponential.
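
To make the problem setup concrete, the following is a minimal sketch of the fixed budget TBP with a naive uniform-allocation baseline. It is not one of the algorithms from the paper: the function names (`uniform_tbp`, `is_monotone`, `is_concave`, `is_error`), the Gaussian noise model, and the equal split of the budget across arms are all assumptions made purely for illustration.

```python
import random


def uniform_tbp(means, threshold, budget, noise_sd=1.0, seed=0):
    """Naive baseline (assumed here, not the paper's method): pull every arm
    equally often, then output the arms whose empirical mean exceeds the
    threshold."""
    rng = random.Random(seed)
    pulls_per_arm = budget // len(means)
    selected = set()
    for k, mu in enumerate(means):
        # Simulated rewards; Gaussian noise is an illustrative assumption.
        samples = [rng.gauss(mu, noise_sd) for _ in range(pulls_per_arm)]
        if sum(samples) / pulls_per_arm > threshold:
            selected.add(k)
    return selected


def is_monotone(means):
    """Shape constraint (i): the sequence of means is non-decreasing (MTBP)."""
    return all(a <= b for a, b in zip(means, means[1:]))


def is_concave(means):
    """Shape constraint (ii): successive increments are non-increasing (CTBP)."""
    return all(
        means[i + 1] - means[i] >= means[i + 2] - means[i + 1]
        for i in range(len(means) - 2)
    )


def is_error(selected, means, threshold):
    """Error event from the abstract: at least one arm is misclassified."""
    truth = {k for k, mu in enumerate(means) if mu > threshold}
    return selected != truth
```

Averaging `is_error` over many seeds gives a Monte Carlo estimate of the probability of error for this baseline on a given instance. The baseline ignores the shape constraints entirely; it only shows what the learner must output and what counts as an error.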

Comments:	25 pages. arXiv admin note: text overlap with arXiv:2006.10006
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2106.10166 [stat.ML]
	(or arXiv:2106.10166v1 [stat.ML] for this version)

Submission history

From: Pierre Menard
[v1] Fri, 18 Jun 2021 15:01:01 GMT (1632kb,D)
