Prosocial learning agents solve generalized Stag Hunts better than selfish ones

Peysakhovich, Alexander; Lerer, Adam



(Submitted on 8 Sep 2017 (v1), last revised 8 Dec 2017 (this version, v2))

Abstract: Deep reinforcement learning has become an important paradigm for constructing agents that can enter complex multi-agent situations and improve their policies through experience. One commonly used technique is reactive training: applying standard RL methods while treating other agents as part of the learner's environment. It is known that in general-sum games reactive training can lead groups of agents to converge to inefficient outcomes. We focus on one such class of environments: Stag Hunt games. Here agents choose either a risky cooperative policy (which leads to high payoffs if both choose it but a low payoff to an agent who attempts it alone) or a safe one (which leads to a safe payoff no matter what the other agent does). We ask how we can change the learning rule of a single agent to improve its outcomes in Stag Hunts that include other reactive learners. We extend existing work on reward shaping in multi-agent reinforcement learning and show that making a single agent prosocial, that is, making it care about the rewards of its partners, can increase the probability that groups converge to good outcomes. Thus, even if we control only a single agent in a group, making that agent prosocial can increase our agent's long-run payoff. We show experimentally that this result carries over to a variety of more complex environments with Stag Hunt-like dynamics, including ones where agents must learn from raw input pixels.
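
The Stag Hunt structure described in the abstract can be illustrated with a small two-action matrix game. The sketch below is not taken from the paper: the payoff values, the weight `alpha`, and the function name `stag_threshold` are illustrative assumptions. It computes how confident an agent must be that its partner will hunt stag before stag becomes its best response, first for a selfish agent and then for one whose utility also counts the partner's reward.

```python
from fractions import Fraction

STAG, HARE = "stag", "hare"

# Assumed payoffs, chosen only to have the Stag Hunt shape:
# PAYOFF[(my_action, partner_action)] = (my_reward, partner_reward).
# Hare is "safe": it pays 3 regardless of what the partner does.
PAYOFF = {
    (STAG, STAG): (4, 4),
    (STAG, HARE): (0, 3),
    (HARE, STAG): (3, 0),
    (HARE, HARE): (3, 3),
}


def stag_threshold(alpha):
    """Smallest belief p (partner plays stag) at which stag is a best response.

    alpha is a hypothetical prosociality weight: the agent maximizes
    own_reward + alpha * partner_reward. alpha = 0 is a selfish agent.
    """
    alpha = Fraction(alpha)
    u = {k: own + alpha * other for k, (own, other) in PAYOFF.items()}
    # Indifference: p*u[S,S] + (1-p)*u[S,H] == p*u[H,S] + (1-p)*u[H,H]
    gain = u[(STAG, STAG)] - u[(HARE, STAG)]  # benefit of stag if partner plays stag
    loss = u[(HARE, HARE)] - u[(STAG, HARE)]  # cost of stag if partner plays hare
    return loss / (gain + loss)


print(stag_threshold(0))  # selfish agent: 3/4
print(stag_threshold(1))  # fully prosocial agent: 3/8
```

Under these assumed numbers the selfish agent needs to believe its partner hunts stag with probability at least 3/4, while the prosocial agent needs only 3/8. This is one way to read the abstract's claim: caring about the partner's reward makes the cooperative action a best response against a wider range of partner behavior.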

Subjects:	Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT)
Cite as:	arXiv:1709.02865 [cs.AI]
	(or arXiv:1709.02865v2 [cs.AI] for this version)

Submission history

From: Alexander Peysakhovich
[v1] Fri, 8 Sep 2017 21:52:58 GMT (350kb,D)
[v2] Fri, 8 Dec 2017 17:55:10 GMT (394kb,D)
