Computer Science > Machine Learning
Title: CD-GraB: Coordinating Distributed Example Orders for Provably Accelerated Training
(Submitted on 2 Feb 2023 (v1), last revised 29 May 2023 (this version, v3))
Abstract: Recent research on online Gradient Balancing (GraB) has revealed that there exist permutation-based example orderings that are guaranteed to outperform random reshuffling (RR). Whereas RR arbitrarily permutes training examples, GraB leverages stale gradients from prior epochs to order examples -- achieving a provably faster convergence rate than RR. However, GraB is limited by design: while it demonstrates an impressive ability to scale up training on centralized data, it does not naturally extend to modern distributed ML workloads. We therefore propose Coordinated Distributed GraB (CD-GraB), which uses insights from prior work on kernel thinning to translate the benefits of provably faster permutation-based example ordering to distributed settings. With negligible overhead, CD-GraB exhibits a linear speedup in convergence rate over centralized GraB and outperforms baselines empirically, including distributed RR, on a variety of benchmark tasks.
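Illustrative sketch (not taken from the paper): the abstract describes ordering examples by balancing stale gradients. One common way to picture this is a greedy signed-balancing pass, in which each centered gradient is given a sign that keeps a running sum small, and the signs then decide whether an example moves toward the front or the back of the next epoch's order. The function name `balance_reorder`, the use of an exact mean instead of a stale one, and the tie-breaking rule are assumptions made for illustration; this is a single-node toy and does not cover the coordinated, distributed procedure that CD-GraB proposes.

```python
import random


def balance_reorder(grads, order):
    """Return a new example order from per-example gradients (toy sketch).

    grads: list of gradient vectors (lists of floats), indexed by example id.
    order: current permutation of example ids.
    Assumption: gradients are centered with their exact mean; an online
    variant would use a stale mean carried over from the previous epoch.
    """
    n, dim = len(grads), len(grads[0])
    mean = [sum(g[d] for g in grads) / n for d in range(dim)]
    running = [0.0] * dim      # signed running sum of centered gradients
    front, back = [], []
    for idx in order:
        centered = [grads[idx][d] - mean[d] for d in range(dim)]
        plus = sum((r + c) ** 2 for r, c in zip(running, centered))
        minus = sum((r - c) ** 2 for r, c in zip(running, centered))
        if plus <= minus:      # sign +1 keeps the running sum smaller
            running = [r + c for r, c in zip(running, centered)]
            front.append(idx)
        else:                  # sign -1 keeps the running sum smaller
            running = [r - c for r, c in zip(running, centered)]
            back.append(idx)
    # +1 examples keep their relative order; -1 examples are appended reversed.
    return front + back[::-1]


if __name__ == "__main__":
    random.seed(0)
    toy_grads = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
    print(balance_reorder(toy_grads, list(range(8))))
```

The result is always a permutation of the input order, so it can be used directly as the example order for the next epoch in a toy training loop.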
Submission history
From: A. Feder Cooper
[v1] Thu, 2 Feb 2023 03:15:29 GMT (6743kb,D)
[v2] Mon, 6 Mar 2023 23:04:27 GMT (6744kb,D)
[v3] Mon, 29 May 2023 22:36:53 GMT (7836kb,D)