Title: Unbiased Self-Play

Authors: Shohei Ohsawa
Abstract: We present a general optimization framework for emergent belief-state representation without any supervision. We employed the common configuration of multiagent reinforcement learning with communication to improve exploration coverage of an environment by leveraging the knowledge of each agent. In this paper, we found that recurrent neural nets (RNNs) with shared weights are highly biased in partially observable environments because of their noncooperativity. To address this, we designed an unbiased version of self-play via mechanism design, also known as reverse game theory, to clarify unbiased knowledge at the Bayesian Nash equilibrium. The key idea is to add imaginary rewards using the peer prediction mechanism, i.e., a mechanism for mutually criticizing information in a decentralized environment. Numerical analyses, including StarCraft exploration tasks with up to 20 agents and off-the-shelf RNNs, demonstrate state-of-the-art performance.
Comments: Several mathematical flaws found
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Econometrics (econ.EM)
Cite as: arXiv:2106.03007 [stat.ML]
  (or arXiv:2106.03007v2 [stat.ML] for this version)

Submission history

From: Shohei Ohsawa
[v1] Sun, 6 Jun 2021 02:16:45 GMT (6898kb,D)
[v2] Fri, 6 May 2022 22:16:06 GMT (0kb,I)