Title: Coordination without communication: optimal regret in two players multi-armed bandits

Abstract: We consider two agents playing simultaneously the same stochastic three-armed bandit problem. The two agents are cooperating but they cannot communicate. We propose a strategy with no collisions at all between the players (with very high probability), and with near-optimal regret $O(\sqrt{T \log(T)})$. We also argue that the extra logarithmic term $\sqrt{\log(T)}$ should be necessary by proving a lower bound for a full information variant of the problem.
Comments: 28 pages, 5 figures. V2: minor revision
Subjects: Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Machine Learning (stat.ML)
Journal reference: COLT 2020
Cite as: arXiv:2002.07596 [cs.GT]
  (or arXiv:2002.07596v2 [cs.GT] for this version)

Submission history

From: Thomas Budzinski
[v1] Fri, 14 Feb 2020 17:35:42 GMT (29kb)
[v2] Thu, 9 Jul 2020 19:11:02 GMT (29kb)
