Customizing Number Representation and Precision

Sentieys, Olivier; Menard, Daniel

Authors: Olivier Sentieys (TARAN), Daniel Menard (INSA Rennes)

(Submitted on 8 Dec 2022)

Abstract: There is a growing interest in the use of reduced-precision arithmetic, exacerbated by the recent interest in artificial intelligence, especially with deep learning. Most architectures already provide reduced-precision capabilities (e.g., 8-bit integer, 16-bit floating point). In the context of FPGAs, any number format and bit-width can even be considered. In computer arithmetic, the representation of real numbers is a major issue. Fixed-point (FxP) and floating-point (FlP) are the main options to represent reals, both with their advantages and drawbacks. This chapter presents both FxP and FlP number representations, and draws a fair comparison between their cost, performance and energy, as well as their impact on accuracy during computations. It is shown that the choice between FxP and FlP is not obvious and strongly depends on the application considered. In some cases, low-precision floating-point arithmetic can be the most effective and provides some benefits over the classical fixed-point choice for energy-constrained applications.

Comments:	In press
Subjects:	Hardware Architecture (cs.AR); Machine Learning (cs.LG); Signal Processing (eess.SP)
Cite as:	arXiv:2212.04184 [cs.AR]
	(or arXiv:2212.04184v1 [cs.AR] for this version)

Submission history

From: Hal Ccsd [via HAL proxy]
[v1] Thu, 8 Dec 2022 10:54:21 GMT (2173kb,D)
