Computer Science > Distributed, Parallel, and Cluster Computing
Title: Communication Lower Bound in Convolution Accelerators
(Submitted on 8 Nov 2019 (this version), latest version 17 Jan 2020 (v3))
Abstract: In current convolutional neural network (CNN) accelerators, communication (i.e., memory access) dominates energy consumption. This work provides a comprehensive analysis and methodologies to minimize communication in CNN accelerators. For off-chip communication, we derive the theoretical lower bound for any convolutional layer and propose a dataflow that reaches this lower bound. This fundamental problem has not been solved by prior studies. On-chip communication is minimized through an elaborate workload and storage mapping scheme. In addition, we design a communication-optimal CNN accelerator architecture. Evaluations based on 65nm technology demonstrate that the proposed architecture nearly reaches the theoretical minimum communication in a three-level memory hierarchy and that it is computation-dominant. The gap between the energy efficiency of our accelerator and the theoretical best value is only 37-87%.
Submission history
From: Xiaoming Chen
[v1] Fri, 8 Nov 2019 04:54:17 GMT (1173kb,D)
[v2] Thu, 14 Nov 2019 02:08:40 GMT (1248kb,D)
[v3] Fri, 17 Jan 2020 03:04:10 GMT (1238kb,D)