An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

Hong, Kihyuk; Li, Yuhang; Tewari, Ambuj

(Submitted on 29 May 2022 (v1), last revised 20 Feb 2023 (this version, v3))

Abstract: We propose an algorithm for non-stationary kernel bandits that does not require prior knowledge of the degree of non-stationarity. The algorithm follows randomized strategies obtained by solving optimization problems that balance exploration and exploitation. It adapts to non-stationarity by restarting when a change in the reward function is detected. Our algorithm enjoys a tighter dynamic regret bound than previous work on the non-stationary kernel bandit setting. Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature. We extend our algorithm to use a neural network for dynamically adapting the feature mapping to observed data. We prove a dynamic regret bound of the extension using the neural tangent kernel theory. We demonstrate empirically that our algorithm and the extension can adapt to varying degrees of non-stationarity.

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2205.14775 [stat.ML]
	(or arXiv:2205.14775v3 [stat.ML] for this version)

Submission history

From: Kihyuk Hong
[v1] Sun, 29 May 2022 21:32:53 GMT (345kb,D)
[v2] Tue, 5 Jul 2022 15:21:43 GMT (347kb,D)
[v3] Mon, 20 Feb 2023 02:00:24 GMT (700kb,D)
