Title: Non-Stationary Bandits with Knapsack Problems with Advice

Abstract: We consider a non-stationary Bandits with Knapsack problem. The outcome distribution at each time is scaled by a non-stationary quantity that signifies changing demand volumes. Instead of studying settings with limited non-stationarity, we investigate how online predictions of the total demand volume $Q$ allow us to improve our performance guarantees. We show that, without any prediction, any online algorithm incurs regret that is linear in $T$. In contrast, with online predictions of $Q$, we propose an online algorithm that judiciously incorporates the predictions and achieves regret bounds that depend on the accuracy of the predictions. These bounds are shown to be tight in settings where prediction accuracy improves over time. Our theoretical results are corroborated by our numerical findings.
Comments: 33 pages, 4 figures
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as: arXiv:2302.04182 [cs.LG]
  (or arXiv:2302.04182v1 [cs.LG] for this version)

Submission history

From: Wang Chi Cheung
[v1] Wed, 8 Feb 2023 16:40:43 GMT (99kb,D)