We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.DC

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Distributed, Parallel, and Cluster Computing

Title: Communication Lower Bound in Convolution Accelerators

Abstract: In current convolutional neural network (CNN) accelerators, communication (i.e., memory access) dominates the energy consumption. This work provides comprehensive analysis and methodologies to minimize the communication for CNN accelerators. For the off-chip communication, we derive the theoretical lower bound for any convolutional layer and propose a dataflow to reach the lower bound. This fundamental problem has never been solved by prior studies. The on-chip communication is minimized based on an elaborate workload and storage mapping scheme. We in addition design a communication-optimal CNN accelerator architecture. Evaluations based on the 65nm technology demonstrate that the proposed architecture nearly reaches the theoretical minimum communication in a three-level memory hierarchy and it is computation dominant. The gap between the energy efficiency of our accelerator and the theoretical best value is only 37-87%.
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR)
Cite as: arXiv:1911.05662 [cs.DC]
  (or arXiv:1911.05662v2 [cs.DC] for this version)

Submission history

From: Xiaoming Chen [view email]
[v1] Fri, 8 Nov 2019 04:54:17 GMT (1173kb,D)
[v2] Thu, 14 Nov 2019 02:08:40 GMT (1248kb,D)

Link back to: arXiv, form interface, contact.