
Title: NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD provides strong theoretical guarantees; for practical purposes, however, the authors proposed a heuristic variant, which we call QSGDinf, that demonstrated impressive empirical gains for distributed training of large neural networks. In this paper, we build on this work to propose a new gradient quantization scheme and show that it both has stronger theoretical guarantees than QSGD and matches or exceeds the empirical performance of the QSGDinf heuristic and of other compression methods.
Comments: 42 pages, 21 figures. To appear in the Journal of Machine Learning Research (JMLR)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1908.06077 [cs.LG]
  (or arXiv:1908.06077v2 [cs.LG] for this version)

Submission history

From: Ali Ramezani-Kebrya
[v1] Fri, 16 Aug 2019 17:59:01 GMT (8175kb,D)
[v2] Mon, 3 May 2021 21:39:42 GMT (10253kb,D)
