Generative Design of Hardware-aware DNNs

Kao, Sheng-Chun; Ramamurthy, Arun; Krishna, Tushar

(Submitted on 6 Jun 2020 (v1), last revised 12 Jul 2020 (this version, v2))

Abstract: To run DNNs efficiently on the edge and in the cloud, new DNN inference accelerators are frequently being designed and deployed. Model quantization is a widely used approach for improving the resource efficiency of DNNs. However, different accelerators (HW) have different resources, so each HW platform needs its own specialized quantization strategy. Moreover, using the same quantization for every layer may be sub-optimal, which enlarges the design space of possible quantization choices and makes manual tuning infeasible. Recent work on automatically determining per-layer quantization is driven by optimization methods such as reinforcement learning (RL). However, these approaches require re-training the RL agent for every new HW platform. We propose a new approach to autonomous quantization and HW-aware tuning. We propose a generative model, AQGAN, which takes a target accuracy as the condition and generates a suite of quantization configurations. With the conditional generative model, the user can autonomously generate different configurations for different targets at inference time. Moreover, we propose a simplified HW-tuning flow, which uses the generative model to generate proposals and then performs a simple selection based on the HW resource budget; this process is fast and interactive. We evaluate our model on five widely used efficient models on the ImageNet dataset. We compare against existing uniform quantization and state-of-the-art autonomous quantization methods. Our generative model achieves competitive accuracy, but with around two degrees less search cost for each design point. The quantization configurations generated by our model lead to less than 3.5% error across all experiments.
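
The HW-tuning flow described in the abstract (generate proposals, then select by HW resource budget) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the candidate bit-widths, the use of total weight storage in bits as the resource metric, and the rule of picking the smallest feasible proposal. In particular, `propose_configs` is a random stand-in for a trained conditional generator such as AQGAN; it does not model accuracy at all.

```python
import random

BIT_CHOICES = (2, 4, 6, 8)  # assumed candidate bit-widths, not from the paper


def propose_configs(target_accuracy, num_layers, num_proposals, seed=0):
    """Hypothetical placeholder for a trained conditional generator.

    A real generator would map (noise, target_accuracy) to per-layer
    bit-widths. This stub ignores target_accuracy and samples at random,
    only so that the rest of the flow can be run end to end.
    """
    rng = random.Random(seed)
    return [[rng.choice(BIT_CHOICES) for _ in range(num_layers)]
            for _ in range(num_proposals)]


def model_size_bits(config, layer_params):
    """Total weight storage in bits: per-layer bit-width times parameter count."""
    return sum(bits * n for bits, n in zip(config, layer_params))


def select_within_budget(proposals, layer_params, budget_bits):
    """Keep proposals that fit the budget and return the smallest one.

    Returns None when no proposal fits. Picking the smallest feasible
    configuration is an assumed selection rule.
    """
    feasible = [c for c in proposals
                if model_size_bits(c, layer_params) <= budget_bits]
    if not feasible:
        return None
    return min(feasible, key=lambda c: model_size_bits(c, layer_params))


if __name__ == "__main__":
    layer_params = [1000, 5000, 2000]  # made-up parameter counts per layer
    proposals = propose_configs(target_accuracy=0.70, num_layers=3,
                                num_proposals=16)
    print(select_within_budget(proposals, layer_params, budget_bits=40000))
```

Because selection is a plain filter over already generated proposals, changing the budget for a new HW platform only re-runs `select_within_budget`; nothing is re-trained in this sketch.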

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2006.03968 [cs.LG]
	(or arXiv:2006.03968v2 [cs.LG] for this version)

Submission history

From: Sheng-Chun Kao [view email]
[v1] Sat, 6 Jun 2020 20:39:25 GMT (835kb,D)
[v2] Sun, 12 Jul 2020 23:30:23 GMT (835kb,D)
