Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring

Tsuchiya, Taira; Honda, Junya; Sugiyama, Masashi

(Submitted on 17 Jun 2020 (v1), last revised 10 Jun 2021 (this version, v2))

Abstract: We investigate finite stochastic partial monitoring, which is a general model for sequential learning with limited feedback. While Thompson sampling is one of the most promising algorithms for a variety of online decision-making problems, its properties for stochastic partial monitoring have not been theoretically investigated, and the existing algorithm relies on a heuristic approximation of the posterior distribution. To mitigate these problems, we present a novel Thompson-sampling-based algorithm, which enables us to exactly sample the target parameter from the posterior distribution. We also prove that the new algorithm achieves a logarithmic problem-dependent expected pseudo-regret of $\mathrm{O}(\log T)$ for a linearized variant of the problem with local observability. This result is the first regret bound for Thompson sampling in partial monitoring, and it is also the first logarithmic regret bound for Thompson sampling in linear bandits.

Comments:	Published version in NeurIPS 2020, 39 pages, 4 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2006.09668 [stat.ML]
	(or arXiv:2006.09668v2 [stat.ML] for this version)

Submission history

From: Taira Tsuchiya
[v1] Wed, 17 Jun 2020 05:48:33 GMT (900kb,D)
[v2] Thu, 10 Jun 2021 09:00:24 GMT (885kb,D)
