Title: Coordinating Distributed Example Orders for Provably Accelerated Training

Abstract: Recent research on online Gradient Balancing (GraB) has revealed that there exist permutation-based example orderings for SGD that are guaranteed to outperform random reshuffling (RR). Whereas RR arbitrarily permutes training examples, GraB leverages stale gradients from prior epochs to order examples -- achieving a provably faster convergence rate than RR. However, GraB is limited by design: while it demonstrates an impressive ability to scale up training on centralized data, it does not naturally extend to modern distributed ML workloads. We therefore propose Coordinated Distributed GraB (CD-GraB), which uses insights from prior work on kernel thinning to translate the benefits of provably faster permutation-based example ordering to distributed settings. With negligible overhead, CD-GraB exhibits a linear speedup in convergence rate over centralized GraB and outperforms distributed RR on a variety of benchmark tasks.
Comments: NeurIPS 2023
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Cite as: arXiv:2302.00845 [cs.LG]
  (or arXiv:2302.00845v5 [cs.LG] for this version)
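
The abstract contrasts RR, which draws a fresh random permutation each epoch, with orderings derived from stale gradients. The following is a minimal, centralized sketch of that contrast, not the paper's implementation. It assumes a simplified offline variant of sign-based balancing: gradients recorded in the previous epoch are centered, each is greedily assigned a sign that keeps a running sum small, and the next order places the "+" examples first and the "-" examples last in reverse. The function names (`random_reshuffle`, `balance_reorder`, `max_prefix_norm`) and the list-of-floats gradient representation are illustrative assumptions, and the distributed coordination that CD-GraB adds is not modeled here.

```python
import math
import random


def random_reshuffle(n, rng):
    """RR baseline: a fresh uniformly random permutation of range(n)."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def _centered(order, grads):
    """Subtract the mean gradient (over `order`) from every gradient."""
    n, dim = len(order), len(grads[order[0]])
    mean = [sum(grads[i][d] for i in order) / n for d in range(dim)]
    return {i: [g - m for g, m in zip(grads[i], mean)] for i in order}


def balance_reorder(order, stale_grads):
    """Derive the next epoch's order from the previous epoch's gradients.

    Assumption: a simplified offline balancing pass, for illustration only.
    """
    centered = _centered(order, stale_grads)
    running = [0.0] * len(stale_grads[order[0]])
    front, back = [], []
    for i in order:
        c = centered[i]
        plus = sum((r + x) ** 2 for r, x in zip(running, c))
        minus = sum((r - x) ** 2 for r, x in zip(running, c))
        if plus <= minus:
            # Sign +1: adding keeps the running sum smaller; visit early.
            running = [r + x for r, x in zip(running, c)]
            front.append(i)
        else:
            # Sign -1: subtracting keeps it smaller; visit late, reversed.
            running = [r - x for r, x in zip(running, c)]
            back.append(i)
    return front + back[::-1]


def max_prefix_norm(order, grads):
    """Largest Euclidean norm among prefix sums of centered gradients."""
    centered = _centered(order, grads)
    running = [0.0] * len(grads[order[0]])
    worst = 0.0
    for i in order:
        running = [r + x for r, x in zip(running, centered[i])]
        worst = max(worst, math.sqrt(sum(r * r for r in running)))
    return worst
```

A smaller `max_prefix_norm` means the gradients seen so far in an epoch stay closer to the full-dataset average, which is the quantity a balancing-based order tries to control. Comparing it for an RR order and for the output of `balance_reorder` on the same recorded gradients gives a quick sanity check of the idea.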

Submission history

From: A. Feder Cooper
[v1] Thu, 2 Feb 2023 03:15:29 GMT (6743kb,D)
[v2] Mon, 6 Mar 2023 23:04:27 GMT (6744kb,D)
[v3] Mon, 29 May 2023 22:36:53 GMT (7836kb,D)
[v4] Wed, 6 Dec 2023 05:49:55 GMT (7883kb,D)
[v5] Thu, 21 Dec 2023 19:41:57 GMT (7883kb,D)