Decoupled Federated Learning for ASR with Non-IID Data

Zhu, Han; Wang, Jindong; Cheng, Gaofeng; Zhang, Pengyuan; Yan, Yonghong



(Submitted on 18 Jun 2022)

Abstract: Automatic speech recognition (ASR) with federated learning (FL) makes it possible to leverage data from multiple clients without compromising privacy. The quality of FL-based ASR can be measured by recognition performance, communication cost, and computation cost. When data among different clients are not independently and identically distributed (non-IID), performance can degrade significantly. In this work, we tackle the non-IID issue in FL-based ASR with personalized FL, which learns a personalized model for each client. Concretely, we propose two types of personalized FL approaches for ASR. First, we adapt personalization layer based FL to ASR, which keeps some layers local to each client in order to learn personalized models. Second, to reduce the communication and computation costs, we propose decoupled federated learning (DecoupleFL). On the one hand, DecoupleFL moves the computation burden to the server, thus decreasing the computation on clients. On the other hand, DecoupleFL communicates secure high-level features instead of model parameters, thus reducing the communication cost when models are large. Experiments demonstrate that the two proposed personalized FL-based ASR approaches reduce WER by 2.3% - 3.4% compared with FedAvg. Of the two, DecoupleFL incurs only 11.4% of the communication cost and 75% of the computation cost of FedAvg, which is also significantly less than personalization layer based FL.
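
The personalization layer idea in the abstract (share some layers through the server, keep others on the client) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer names `encoder` and `decoder`, the choice of which layer stays local, the helper names, and the toy weights are all hypothetical. The averaging step assumes the usual FedAvg weighting by each client's number of training samples.

```python
from typing import Dict, List, Set

# Hypothetical representation: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def aggregate_shared(client_params: List[Params],
                     client_sizes: List[int],
                     personal_layers: Set[str]) -> Params:
    """Sample-weighted average of every layer NOT marked as personal."""
    total = sum(client_sizes)
    shared_names = [n for n in client_params[0] if n not in personal_layers]
    averaged: Params = {}
    for name in shared_names:
        dim = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * s for p, s in zip(client_params, client_sizes)) / total
            for i in range(dim)
        ]
    return averaged


def update_client(local: Params, shared: Params) -> Params:
    """Overwrite shared layers with the server average; personal layers stay local."""
    merged = {name: list(w) for name, w in local.items()}
    for name, w in shared.items():
        merged[name] = list(w)
    return merged


# Toy example with two clients; which layer is personal is an assumption.
clients = [
    {"encoder": [1.0, 2.0], "decoder": [10.0]},
    {"encoder": [3.0, 4.0], "decoder": [20.0]},
]
sizes = [1, 3]  # assumed number of local training samples per client

shared = aggregate_shared(clients, sizes, personal_layers={"decoder"})
new_clients = [update_client(c, shared) for c in clients]
```

In this sketch only the `encoder` weights would ever be sent to the server; each client's `decoder` never leaves the device, which is what lets the resulting models differ per client.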

Comments:	Accepted by Interspeech 2022
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC); Sound (cs.SD)
Cite as:	arXiv:2206.09102 [eess.AS]
	(or arXiv:2206.09102v1 [eess.AS] for this version)

Submission history

From: Han Zhu
[v1] Sat, 18 Jun 2022 03:44:37 GMT (374kb,D)
