Error Compensated Distributed SGD Can Be Accelerated

Qian, Xun; Richtárik, Peter; Zhang, Tong

(Submitted on 30 Sep 2020)

Abstract: Gradient compression is a recent and increasingly popular technique for reducing the communication cost in distributed training of large-scale machine learning models. In this work we focus on developing efficient distributed methods that can work for any compressor satisfying a certain contraction property, which includes both unbiased (after appropriate scaling) and biased compressors such as RandK and TopK. Applied naively, gradient compression introduces errors that either slow down convergence or lead to divergence. A popular technique designed to tackle this issue is error compensation/error feedback. Due to the difficulties associated with analyzing biased compressors, it is not known whether gradient compression with error compensation can be combined with Nesterov's acceleration. In this work, we show for the first time that error compensated gradient compression methods can be accelerated. In particular, we propose and study the error compensated loopless Katyusha method, and establish an accelerated linear convergence rate under standard assumptions. We show through numerical experiments that the proposed method converges with substantially fewer communication rounds than previous error compensated algorithms.
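To make the terms in the abstract concrete, here is a minimal sketch of the two named compressors and of one plain error-feedback step. It is not the error compensated loopless Katyusha method proposed in the paper, which the abstract does not spell out. The function names (`top_k`, `rand_k`, `sq_norm`, `ef_step`) are hypothetical. The contraction property is assumed to take the commonly used form E||C(x) - x||^2 <= (1 - delta) ||x||^2, which TopK and unscaled RandK satisfy with delta = k/d.

```python
import random


def top_k(v, k):
    """TopK: keep the k entries of largest magnitude, zero the rest (biased)."""
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def rand_k(v, k, rng):
    """RandK (unscaled): keep k uniformly random coordinates, zero the rest.

    Multiplying the result by len(v) / k would make it unbiased.
    """
    keep = rng.sample(range(len(v)), k)
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(x * x for x in v)


def ef_step(x, grad, err, lr, compress):
    """One plain error-feedback step (assumed textbook form, not the paper's method).

    The worker compresses the accumulated error plus the scaled gradient,
    applies only the compressed part, and carries the residual forward.
    """
    p = [e + lr * g for e, g in zip(err, grad)]
    c = compress(p)
    new_x = [xi - ci for xi, ci in zip(x, c)]
    new_err = [pi - ci for pi, ci in zip(p, c)]
    return new_x, new_err


if __name__ == "__main__":
    v = [1.0, -2.0, 3.0, -4.0]
    c = top_k(v, 2)
    residual = [a - b for a, b in zip(v, c)]
    # Contraction check for TopK with delta = k/d = 2/4.
    print(sq_norm(residual) <= (1 - 2 / 4) * sq_norm(v))
    # One error-feedback step on f(x) = 0.5 * ||x||^2, whose gradient is x.
    print(ef_step(v, v, [0.0] * 4, 0.5, lambda p: top_k(p, 1)))
```

The residual kept in `new_err` is what distinguishes error feedback from naive compression: coordinates dropped by the compressor are not lost but re-enter the next step.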

Comments:	24 pages, 6 figures
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:2010.00091 [math.OC]
	(or arXiv:2010.00091v1 [math.OC] for this version)

Submission history

From: Xun Qian
[v1] Wed, 30 Sep 2020 20:09:31 GMT (137kb,D)
