Title: Asynchronous Multi-Model Dynamic Federated Learning over Wireless Networks: Theory, Modeling, and Optimization

Abstract: Federated learning (FL) has emerged as a key technique for distributed machine learning (ML). Most literature on FL has focused on ML model training for (i) a single task/model, with (ii) a synchronous scheme for updating model parameters, and (iii) a static data distribution across devices, assumptions that are often unrealistic in practical wireless environments. To address this, we develop DMA-FL, which considers dynamic FL with multiple downstream tasks/models over an asynchronous model update architecture. We first characterize convergence by introducing scheduling tensors and rectangular functions that capture the impact of system parameters on learning performance. Our analysis sheds light on the joint impact of device training variables (e.g., the number of local gradient descent steps), asynchronous scheduling decisions (i.e., when a device trains a task), and dynamic data drifts on the performance of ML training for different tasks. Leveraging these results, we formulate an optimization problem for jointly configuring resource allocation and device scheduling to strike an efficient trade-off between energy consumption and ML performance. Our solver for the resulting non-convex mixed-integer program employs constraint relaxations and successive convex approximations with convergence guarantees. Through numerical experiments, we reveal that DMA-FL substantially improves the performance-efficiency trade-off.
Comments: Completed the major revision for IEEE Transactions on Cognitive Communications and Networking
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:2305.13503 [cs.LG]
  (or arXiv:2305.13503v3 [cs.LG] for this version)
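
The abstract mentions scheduling tensors and rectangular functions without defining them. As a rough illustration only, the sketch below assumes a binary tensor X[d][k][t] that equals 1 when device d trains task k at time step t, built from rectangular (boxcar) functions over training windows. The names `rect`, `build_schedule`, `active_devices`, the `windows` layout, and the half-open interval convention are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the tensor layout, function names, and interval
# convention are assumptions, not the paper's definitions.

def rect(t, start, end):
    """Rectangular (boxcar) function: 1 on the half-open interval [start, end), else 0."""
    return 1 if start <= t < end else 0


def build_schedule(windows, num_devices, num_tasks, horizon):
    """Build a binary scheduling tensor X[d][k][t] from hypothetical training windows.

    `windows` maps (device, task) to a list of (start, end) intervals during
    which that device trains that task. Pairs that are absent stay all-zero.
    """
    X = [[[0] * horizon for _ in range(num_tasks)] for _ in range(num_devices)]
    for (d, k), intervals in windows.items():
        for t in range(horizon):
            # The device is active at t if any of its windows covers t.
            X[d][k][t] = max((rect(t, s, e) for s, e in intervals), default=0)
    return X


def active_devices(X, task, t):
    """Count how many devices train `task` at time step t."""
    return sum(X[d][task][t] for d in range(len(X)))


# Two devices, two tasks, six time steps; device 1 switches between tasks.
windows = {(0, 0): [(0, 3)], (1, 0): [(2, 5)], (1, 1): [(0, 2)]}
X = build_schedule(windows, num_devices=2, num_tasks=2, horizon=6)
```

Under these assumptions, the devices overlap on task 0 only at time step 2, which is the kind of asynchronous, per-task participation pattern the abstract describes.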

Submission history

From: Zhan-Lun Chang
[v1] Mon, 22 May 2023 21:39:38 GMT (457kb,D)
[v2] Fri, 21 Jul 2023 00:15:28 GMT (651kb,D)
[v3] Thu, 15 Feb 2024 23:04:03 GMT (1114kb,D)
