Title: Distributed Optimization using Heterogeneous Compute Systems

Authors: Vineeth S
Abstract: Hardware compute power has been growing at an unprecedented rate in recent years. Exploiting these advances is key to producing better results in less time, in both academia and industry. However, combining existing hardware with the latest hardware in the same ecosystem is challenging, and one of the key challenges is varying compute power. In this paper, we consider training deep neural networks on a distributed system of workers with varying compute power. A naive implementation of synchronous distributed training leaves the faster workers waiting for the slowest worker to finish processing. To mitigate this issue, we propose to dynamically adjust the data assigned to each worker during training: each worker is assigned a partition of the total data proportional to its compute power. Our experiments show that dynamically adjusting the data partition improves the utilization of the system and significantly reduces training time. Code is available at the repository: this https URL.
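
The proportional assignment described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names `proportional_partition` and `rebalance` are hypothetical, and the sketch assumes that a worker's compute power is estimated as samples processed per second in the previous epoch, which the abstract does not specify.

```python
from typing import List, Sequence


def proportional_partition(num_samples: int, throughputs: Sequence[float]) -> List[int]:
    """Split num_samples across workers in proportion to their throughput.

    Hypothetical helper: throughputs[i] is an estimate of worker i's
    compute power (assumed here to be samples per second).
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("throughputs must be non-empty and strictly positive")

    total = sum(throughputs)
    exact = [num_samples * t / total for t in throughputs]
    sizes = [int(x) for x in exact]  # round each ideal share down

    # Hand the samples lost to rounding to the workers with the largest
    # fractional remainders, so the sizes sum to num_samples exactly.
    leftover = num_samples - sum(sizes)
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - sizes[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


def rebalance(
    num_samples: int, prev_sizes: Sequence[int], epoch_times: Sequence[float]
) -> List[int]:
    """Recompute partition sizes from the previous epoch's measurements.

    Assumption: throughput = samples processed / wall-clock seconds.
    A worker that processed zero samples yields zero throughput, which
    proportional_partition rejects with ValueError.
    """
    throughputs = [n / t for n, t in zip(prev_sizes, epoch_times)]
    return proportional_partition(num_samples, throughputs)


if __name__ == "__main__":
    # Two workers start with equal shares; the second takes twice as long,
    # so the next epoch gives it roughly half as much data as the first.
    print(rebalance(1000, [500, 500], [10.0, 20.0]))
```

Under these assumptions, calling `rebalance` once per epoch moves data toward the faster workers so that all workers finish an epoch at roughly the same time, which is the idle-time reduction the abstract describes.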
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:2110.08941 [cs.LG]
  (or arXiv:2110.08941v1 [cs.LG] for this version)

Submission history

From: Vineeth S
[v1] Sun, 3 Oct 2021 11:21:49 GMT (1066kb,D)
