Title: Online Double Oracle

Abstract: Solving strategic games with huge action spaces is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form games where the number of pure strategies is prohibitively large. Specifically, we combine no-regret analysis from online learning with Double Oracle (DO) methods from game theory. Our method -- \emph{Online Double Oracle (ODO)} -- is provably convergent to a Nash equilibrium (NE). Most importantly, unlike normal DO methods, ODO is \emph{rational} in the sense that each agent in ODO can exploit a strategic adversary with a regret bound of $\mathcal{O}(\sqrt{T k \log(k)})$, where $k$ is not the total number of pure strategies but rather the size of the \emph{effective strategy set}, which is linearly dependent on the support size of the NE. On tens of different real-world games, ODO outperforms DO, PSRO methods, and no-regret algorithms such as Multiplicative Weight Update by a significant margin, both in terms of convergence rate to an NE and average payoff against strategic adversaries.
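
For readers unfamiliar with the no-regret baseline named in the abstract, the following is a minimal sketch of Multiplicative Weight Update in self-play on a small two-player zero-sum matrix game. It is not ODO and not the authors' implementation: the 2x2 payoff matrix, the step size `eta`, the round count, and the function names are all illustrative assumptions. It only shows the standard fact that the time-averaged strategies of two no-regret learners approach a Nash equilibrium, measured here by exploitability (the total gain available to both players from best-responding to the averages).

```python
import math


def mwu_self_play(payoff, rounds=20000, eta=0.01):
    """Run MWU for both players; return their time-averaged strategies.

    payoff[i][j] is the row player's payoff (the column player's loss).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    x = [1.0 / n_rows] * n_rows  # row player's current mixed strategy
    y = [1.0 / n_cols] * n_cols  # column player's current mixed strategy
    x_avg = [0.0] * n_rows
    y_avg = [0.0] * n_cols
    for _ in range(rounds):
        x_avg = [a + p / rounds for a, p in zip(x_avg, x)]
        y_avg = [a + p / rounds for a, p in zip(y_avg, y)]
        # Expected payoff of each pure strategy against the opponent's mix.
        row_gain = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_loss = [sum(payoff[i][j] * x[i] for i in range(n_rows))
                    for j in range(n_cols)]
        # Multiplicative update, renormalised each round to avoid overflow.
        x = [p * math.exp(eta * g) for p, g in zip(x, row_gain)]
        y = [q * math.exp(-eta * l) for q, l in zip(y, col_loss)]
        sx, sy = sum(x), sum(y)
        x = [p / sx for p in x]
        y = [q / sy for q in y]
    return x_avg, y_avg


def exploitability(payoff, x, y):
    """Best-response gain of the row player plus that of the column player."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col


# Hypothetical 2x2 game whose unique NE mixes (0.4, 0.6) for both players.
GAME = [[2.0, -1.0],
        [-1.0, 1.0]]

if __name__ == "__main__":
    x_bar, y_bar = mwu_self_play(GAME)
    print("row average:", x_bar)
    print("column average:", y_bar)
    print("exploitability:", exploitability(GAME, x_bar, y_bar))
```

Note that each MWU round here touches every pure strategy of both players, which is the cost that becomes prohibitive when the strategy sets are huge. The abstract's regret bound instead depends on $k$, the size of the effective strategy set.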
Comments: Accepted at Transactions on Machine Learning Research (TMLR)
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT)
Journal reference: Transactions on Machine Learning Research 2022
Cite as: arXiv:2103.07780 [cs.AI]
  (or arXiv:2103.07780v5 [cs.AI] for this version)

Submission history

From: Le Cong Dinh
[v1] Sat, 13 Mar 2021 19:48:27 GMT (24837kb,D)
[v2] Tue, 16 Mar 2021 14:34:47 GMT (24838kb,D)
[v3] Fri, 4 Jun 2021 22:50:56 GMT (23668kb,D)
[v4] Mon, 16 May 2022 16:43:15 GMT (23749kb,D)
[v5] Wed, 15 Feb 2023 09:58:59 GMT (23749kb,D)