Improved Regret for Zeroth-Order Adversarial Bandit Convex Optimisation

Lattimore, Tor

(Submitted on 31 May 2020 (v1), last revised 25 Sep 2020 (this version, v3))

Abstract: We prove that the information-theoretic upper bound on the minimax regret for zeroth-order adversarial bandit convex optimisation is at most $O(d^{2.5} \sqrt{n} \log(n))$, where $d$ is the dimension and $n$ is the number of interactions. This improves on $O(d^{9.5} \sqrt{n} \log(n)^{7.5})$ by Bubeck et al. (2017). The proof is based on identifying an improved exploratory distribution for convex functions.

Comments:	To appear in Mathematical Statistics and Learning. 22 pages, 6 figures
Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2006.00475 [math.OC]
	(or arXiv:2006.00475v3 [math.OC] for this version)

Submission history

From: Tor Lattimore [view email]
[v1] Sun, 31 May 2020 09:22:10 GMT (297kb,D)
[v2] Fri, 19 Jun 2020 13:04:49 GMT (397kb,D)
[v3] Fri, 25 Sep 2020 13:10:28 GMT (224kb,D)
