Title: Customizing Number Representation and Precision

Authors: Olivier Sentieys (TARAN), Daniel Menard (INSA Rennes)
Abstract: There is a growing interest in the use of reduced-precision arithmetic, amplified by the recent interest in artificial intelligence, especially deep learning. Most architectures already provide reduced-precision capabilities (e.g., 8-bit integer, 16-bit floating point). In the context of FPGAs, any number format and bit-width can even be considered. In computer arithmetic, the representation of real numbers is a major issue. Fixed-point (FxP) and floating-point (FlP) are the main options to represent reals, each with its advantages and drawbacks. This chapter presents both FxP and FlP number representations, and draws a fair comparison between their cost, performance and energy, as well as their impact on accuracy during computations. It is shown that the choice between FxP and FlP is not obvious and strongly depends on the application considered. In some cases, low-precision floating-point arithmetic can be the most effective and provides some benefits over the classical fixed-point choice for energy-constrained applications.
Comments: In press
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG); Signal Processing (eess.SP)
Cite as: arXiv:2212.04184 [cs.AR]
  (or arXiv:2212.04184v1 [cs.AR] for this version)

Submission history

From: Hal Ccsd [via HAL proxy]
[v1] Thu, 8 Dec 2022 10:54:21 GMT (2173kb,D)
