We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CR

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Distributed, Parallel, and Cluster Computing

Title: Leopard: Towards High Throughput-Preserving BFT for Large-scale Systems

Abstract: With the emergence of large-scale decentralized applications, a scalable and efficient Byzantine Fault Tolerant (BFT) protocol of hundreds of replicas is desirable. Although the throughput of existing leader-based BFT protocols has reached a high level of $10^5$ requests per second for a small scale of replicas, it drops significantly when the number of replicas increases, which leads to a lack of practicality. This paper focuses on the scalability of BFT protocols and identifies a major bottleneck to leader-based BFT protocols due to the excessive workload of the leader at large scales. A new metric of scaling factor is defined to capture whether a BFT protocol will get stuck when it scales out, which can be used to measure the performance of efficiency and scalability of BFT protocols. We propose "Leopard", the first leader-based BFT protocol that scales to multiple hundreds of replicas, and more importantly, preserves a high efficiency. We remove the bottleneck by introducing a technique of achieving a constant scaling factor, which takes full advantage of the idle resource and adaptively balances the workload of the leader among all replicas. We implement Leopard and evaluate its performance compared to HotStuff, the state-of-the-art BFT protocol. We run extensive experiments on the two systems with up to 600 replicas. The results show that Leopard achieves significant performance improvements both on throughput and scalability. In particular, the throughput of Leopard remains at a high level of $10^5$ when the system scales out to 600 replicas. It achieves a $5\times$ throughput over HotStuff when the scale is 300 (which is already the largest scale we can see the progress of the latter in our experiments), and the gap becomes wider as the number of replicas further increases.
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
Cite as: arXiv:2106.08114 [cs.DC]
  (or arXiv:2106.08114v4 [cs.DC] for this version)

Submission history

From: Kexin Hu [view email]
[v1] Tue, 15 Jun 2021 13:16:48 GMT (4615kb,D)
[v2] Wed, 16 Jun 2021 03:46:43 GMT (4576kb,D)
[v3] Thu, 29 Jul 2021 07:42:55 GMT (5323kb,D)
[v4] Sun, 6 Feb 2022 14:28:15 GMT (6749kb,D)

Link back to: arXiv, form interface, contact.