We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

stat.ML

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Statistics > Machine Learning

Title: Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits

Abstract: We study the problem of dynamic batch learning in high-dimensional sparse linear contextual bandits, where a decision maker can only adapt decisions at a batch level. In particular, the decision maker, only observing rewards at the end of each batch, dynamically decides how many individuals to include in the next batch (at the current batch's end) and what personalized action-selection scheme to adopt within the batch. Such batch constraints are ubiquitous in a variety of practical contexts, including personalized product offerings in marketing and medical treatment selection in clinical trials. We characterize the fundamental learning limit in this problem via a novel lower bound analysis and provide a simple, exploration-free algorithm that uses the LASSO estimator, which achieves the minimax optimal performance characterized by the lower bound (up to log factors). To our best knowledge, our work provides the first inroad into a rigorous understanding of dynamic batch learning with high-dimensional covariates. We also demonstrate the efficacy of our algorithm on both synthetic data and the Warfarin medical dosing data. The empirical results show that with three batches (hence only two opportunities to adapt), our algorithm already performs comparably (in terms of statistical performance) to the state-of-the-art fully online high-dimensional linear contextual bandits algorithm. As an added bonus, since our algorithm operates in batches, it is orders of magnitudes faster than fully online learning algorithms. As such, our algorithm provides a desirable candidate for practical data-driven personalized decision making problems, where limited adaptivity is often a hard constraint.
Comments: 36 pages, 2 figures
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)
Cite as: arXiv:2008.11918 [stat.ML]
  (or arXiv:2008.11918v3 [stat.ML] for this version)

Submission history

From: Zhimei Ren [view email]
[v1] Thu, 27 Aug 2020 05:34:34 GMT (156kb)
[v2] Fri, 28 Aug 2020 01:03:55 GMT (156kb)
[v3] Wed, 30 Jun 2021 19:40:59 GMT (1240kb,D)

Link back to: arXiv, form interface, contact.