Title: Integer-Only Neural Network Quantization Scheme Based on Shift-Batch-Normalization

Abstract: Neural networks are widely used in many areas, but their high computational complexity makes them hard to run on devices with limited resources. To address this problem, quantization methods are used to reduce model size and computation cost, making it possible to run neural networks on embedded platforms and mobile devices.
In this paper, an integer-only quantization scheme is introduced. The scheme uses a single layer that combines shift-based batch normalization and uniform quantization to implement 4-bit integer-only inference. Because it avoids the large-integer multiplication used in previous integer-only quantization methods, the scheme can achieve good power and latency efficiency, and it is especially suitable for deployment on co-designed hardware platforms. Tests show that the scheme works very well on easy tasks; on harder tasks, it incurs a performance loss that can be tolerated in exchange for its inference efficiency. Our work is available on GitHub: this https URL
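To make the idea of a combined shift-based batch normalization and uniform quantization layer concrete, here is a minimal sketch. It is not the authors' implementation: the function names `nearest_shift` and `shift_bn_quantize`, the per-call integer `bias`, the round-half-up shift, and the unsigned 4-bit output range are all assumptions made for illustration. The only point it demonstrates is that when the normalization scale is restricted to a power of two, the whole layer reduces to an integer add, a bit shift, and a clamp, with no large-integer multiplication.

```python
import math


def nearest_shift(scale):
    """Hypothetical helper: return s so that 2**-s approximates a positive float scale."""
    return round(-math.log2(scale))


def shift_bn_quantize(acc, bias, shift, bits=4):
    """Assumed layer form: (acc + bias) * 2**-shift, rounded, clamped to [0, 2**bits - 1].

    acc   -- list of integer accumulator values (e.g. from an integer convolution)
    bias  -- integer offset standing in for the folded batch-norm mean/beta term
    shift -- integer exponent; positive means a right shift
    """
    qmax = (1 << bits) - 1
    out = []
    for a in acc:
        v = a + bias
        if shift > 0:
            # Add half of the divisor first so the right shift rounds half up.
            v = (v + (1 << (shift - 1))) >> shift
        elif shift < 0:
            v = v << -shift
        # Clamping at 0 also plays the role of a ReLU in this sketch.
        out.append(max(0, min(qmax, v)))
    return out


# Example: a float scale of 0.125 becomes a right shift by 3.
codes = shift_bn_quantize([100, -20, 37, 500], bias=12, shift=nearest_shift(0.125))
print(codes)  # [14, 0, 6, 15]
```

In this sketch, a trained floating-point scale would be converted once, offline, with `nearest_shift`, so that inference touches only integers.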
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2106.00127 [cs.LG]
  (or arXiv:2106.00127v1 [cs.LG] for this version)

Submission history

From: Guo Qingyu
[v1] Fri, 28 May 2021 09:28:12 GMT (261kb,D)
