Title: Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications

Abstract: We study combinatorial multi-armed bandit with probabilistically triggered arms (CMAB-T) and semi-bandit feedback. We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of $1/p^*$, where $p^*$ is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability modulated (TPM) bounded smoothness condition into the general CMAB-T framework, and show that many applications such as influence maximization bandit and combinatorial cascading bandit satisfy this TPM condition. As a result, we completely remove the factor of $1/p^*$ from the regret bounds, achieving significantly better regret bounds for influence maximization and cascading bandits than before. Finally, we provide lower bound results showing that the factor $1/p^*$ is unavoidable for general CMAB-T problems, suggesting that the TPM condition is crucial in removing this factor.
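To make the TPM idea concrete, here is a minimal numerical sketch, not taken from the paper. It assumes a disjunctive cascade over independent Bernoulli arms: arms are examined in order until one yields outcome 1, so the k-th arm is triggered with probability $\prod_{j<k}(1-\mu_j)$ and the expected reward is $1-\prod_k(1-\mu_k)$. The inequality checked is our reading of a 1-norm form of the condition with constant B = 1, namely $|r(\mu)-r(\mu')| \le B \sum_k p_k^{\mu}\,|\mu_k-\mu'_k|$. All function names are hypothetical.

```python
import random


def trigger_probs(mu):
    """Probability that each position is examined (triggered) in a
    disjunctive cascade: position k is reached only if all earlier
    arms produced outcome 0."""
    probs = []
    reach = 1.0
    for m in mu:
        probs.append(reach)
        reach *= 1.0 - m
    return probs


def expected_reward(mu):
    """Probability that at least one arm in the ordered list yields 1."""
    miss = 1.0
    for m in mu:
        miss *= 1.0 - m
    return 1.0 - miss


def tpm_gap(mu, mu_alt, B=1.0):
    """Right-hand side minus left-hand side of the assumed TPM inequality.
    A non-negative value means the inequality holds for this pair."""
    lhs = abs(expected_reward(mu) - expected_reward(mu_alt))
    rhs = B * sum(
        p * abs(a - b) for p, a, b in zip(trigger_probs(mu), mu, mu_alt)
    )
    return rhs - lhs


def check_random_instances(trials=1000, length=5, seed=0):
    """Return the smallest gap seen over random pairs of mean vectors."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        mu = [rng.random() for _ in range(length)]
        mu_alt = [rng.random() for _ in range(length)]
        worst = min(worst, tpm_gap(mu, mu_alt))
    return worst
```

Each arm's estimation error is weighted by its own triggering probability, so a rarely triggered arm contributes little to the reward gap. This is the intuition behind dropping the $1/p^*$ factor. The random check only illustrates the inequality for this toy cascade and is not a proof.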
Comments: This is the full version of the paper accepted at NIPS'2017
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1703.01610 [cs.LG]
  (or arXiv:1703.01610v5 [cs.LG] for this version)

Submission history

From: Wei Chen
[v1] Sun, 5 Mar 2017 15:31:35 GMT (72kb)
[v2] Thu, 12 Oct 2017 08:25:41 GMT (70kb)
[v3] Sun, 5 Nov 2017 05:50:04 GMT (72kb)
[v4] Wed, 21 Feb 2018 19:21:09 GMT (74kb)
[v5] Tue, 8 Jun 2021 07:55:43 GMT (77kb)