Title: Flare: Flexible In-Network Allreduce

Abstract: The allreduce operation is one of the most commonly used communication routines in distributed applications. To improve its bandwidth and reduce network traffic, the operation can be offloaded to network switches, which aggregate the data received from the hosts and send the aggregated result back to them. However, existing solutions offer limited customization and may perform suboptimally with custom operators and data types, with sparse data, or when reproducibility of the aggregation is a concern. To address these problems, we design a flexible programmable switch that uses PsPIN, a RISC-V architecture implementing the sPIN programming model, as a building block. We then design, model, and analyze different algorithms for executing the aggregation on this architecture, showing performance improvements over state-of-the-art approaches.
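
As a rough illustration of the data flow the abstract describes (hosts send data to a switch, the switch aggregates it, and every host receives the result), the following toy Python model may help. It is only a sketch under stated assumptions: the function name switch_allreduce, the single-switch topology, and the use of plain Python lists are all hypothetical and are not taken from the paper, and the code says nothing about how Flare, PsPIN, or sPIN actually implement aggregation.

```python
from functools import reduce
from typing import Callable, List, Sequence


def switch_allreduce(
    host_data: Sequence[Sequence[float]],
    op: Callable[[float, float], float],
) -> List[List[float]]:
    """Toy model of an allreduce offloaded to a single switch.

    Hypothetical helper for illustration only: each inner sequence is the
    vector contributed by one host, in the order the switch receives it.
    """
    if not host_data:
        return []
    length = len(host_data[0])
    if any(len(vector) != length for vector in host_data):
        raise ValueError("all hosts must contribute vectors of the same length")
    # The switch combines the i-th element of every host's vector,
    # folding left to right in arrival order with the user-supplied operator.
    aggregated = [reduce(op, column) for column in zip(*host_data)]
    # Every host receives its own copy of the aggregated vector.
    return [list(aggregated) for _ in host_data]


hosts = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(switch_allreduce(hosts, lambda a, b: a + b)[0])  # [9.0, 12.0]
print(switch_allreduce(hosts, max)[0])                 # custom operator: [5.0, 6.0]
```

Because floating-point addition is not associative, the order in which contributions arrive at the switch can change the sum in this model. For example, aggregating 0.1, 0.2, 0.3 in that order gives a slightly different double-precision result than aggregating 0.3, 0.2, 0.1, which is one way to see why reproducibility of the aggregation can be a concern.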
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Networking and Internet Architecture (cs.NI)
ACM classes: C.2.4; C.2.1; B.4.3
Journal reference: Published in Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '21) (2021)
DOI: 10.1145/3458817.3476178
Cite as: arXiv:2106.15565 [cs.DC]
  (or arXiv:2106.15565v1 [cs.DC] for this version)

Submission history

From: Daniele De Sensi PhD
[v1] Tue, 29 Jun 2021 16:58:32 GMT (2379kb,D)
