SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with Deep Reinforcement Learning

Hou, Charlie; Zhou, Mingxun; Ji, Yan; Daian, Phil; Tramer, Florian; Fanti, Giulia; Juels, Ari

Full-text links:

Download:

Current browse context:

cs.CR

< prev | next >

new | recent | 1912

Change to browse by:

Computer Science > Cryptography and Security

Title: SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with Deep Reinforcement Learning

Authors: Charlie Hou, Mingxun Zhou, Yan Ji, Phil Daian, Florian Tramer, Giulia Fanti, Ari Juels

(Submitted on 4 Dec 2019 (v1), last revised 4 Aug 2020 (this version, v2))

Abstract: Incentive mechanisms are central to the functionality of permissionless blockchains: they incentivize participants to run and secure the underlying consensus protocol. Designing incentive-compatible incentive mechanisms is notoriously challenging, however. As a result, most public blockchains today use incentive mechanisms whose security properties are poorly understood and largely untested. In this work, we propose SquirRL, a framework for using deep reinforcement learning to analyze attacks on blockchain incentive mechanisms. We demonstrate SquirRL's power by first recovering known attacks: (1) the optimal selfish mining attack in Bitcoin [52], and (2) the Nash equilibrium in block withholding attacks [16]. We also use SquirRL to obtain several novel empirical results. First, we discover a counterintuitive flaw in the widely used rushing adversary model when applied to multi-agent Markov games with incomplete information. Second, we demonstrate that the optimal selfish mining strategy identified in [52] is actually not a Nash equilibrium in the multi-agent selfish mining setting. In fact, our results suggest (but do not prove) that when more than two competing agents engage in selfish mining, there is no profitable Nash equilibrium. This is consistent with the lack of observed selfish mining in the wild. Third, we find a novel attack on a simplified version of Ethereum's finalization mechanism, Casper the Friendly Finality Gadget (FFG) that allows a strategic agent to amplify her rewards by up to 30%. Notably, [10] show that honest voting is a Nash equilibrium in Casper FFG: our attack shows that when Casper FFG is composed with selfish mining, this is no longer the case. Altogether, our experiments demonstrate SquirRL's flexibility and promise as a framework for studying attack settings that have thus far eluded theoretical and empirical understanding.

Subjects:	Cryptography and Security (cs.CR)
Cite as:	arXiv:1912.01798 [cs.CR]
	(or arXiv:1912.01798v2 [cs.CR] for this version)

Submission history

From: Charlie Hou [view email]
[v1] Wed, 4 Dec 2019 04:48:21 GMT (1575kb,D)
[v2] Tue, 4 Aug 2020 19:02:57 GMT (2100kb,D)

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

Link back to: arXiv, form interface, contact.

> cs > arXiv:1912.01798

Download:

Current browse context:

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

Computer Science > Cryptography and Security

Title: SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with Deep Reinforcement Learning

Submission history