Computer Science > Machine Learning
Title: Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret
(Submitted on 25 Feb 2017 (v1), last revised 17 Jan 2018 (this version, v3))
Abstract: We present an efficient second-order algorithm with $\tilde{O}(\frac{1}{\eta}\sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, for a range of $\eta$ restricted by the norm of the competitor. The family of loss functions ranges from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (J. Abernethy and A. Rakhlin. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it also performs favorably against earlier algorithms.
Submission history
From: Chicheng Zhang
[v1] Sat, 25 Feb 2017 23:15:55 GMT (889kb,D)
[v2] Tue, 13 Jun 2017 06:06:03 GMT (899kb,D)
[v3] Wed, 17 Jan 2018 19:22:21 GMT (1205kb,D)