Current browse context:
cs.LG
Change to browse by:
References & Citations
Computer Science > Machine Learning
Title: Data Leakage in Federated Averaging
(Submitted on 24 Jun 2022 (this version), latest version 1 Nov 2022 (v3))
Abstract: Recent attacks have shown that user data can be reconstructed from FedSGD updates, thus breaking privacy. However, these attacks are of limited practical relevance as federated learning typically uses the FedAvg algorithm. It is generally accepted that reconstructing data from FedAvg updates is much harder than FedSGD as: (i) there are unobserved intermediate weight updates, (ii) the order of inputs matters, and (iii) the order of labels changes every epoch. In this work, we propose a new optimization-based attack which successfully attacks FedAvg by addressing the above challenges. First, we solve the optimization problem using automatic differentiation that forces a simulation of the client's update for the reconstructed labels and inputs so as to match the received client update. Second, we address the unknown input order by treating images at different epochs as independent during optimization, while relating them with a permutation invariant prior. Third, we reconstruct the labels by estimating the parameters of existing FedSGD attacks at every FedAvg step. On the popular FEMNIST dataset, we demonstrate that on average we successfully reconstruct >45% of the client's images from realistic FedAvg updates computed on 10 local epochs of 10 batches each with 5 images, compared to only <10% using the baseline. These findings indicate that many real-world federated learning implementations based on FedAvg are vulnerable.
Submission history
From: Dimitar I. Dimitrov [view email][v1] Fri, 24 Jun 2022 17:51:02 GMT (38044kb,D)
[v2] Mon, 27 Jun 2022 16:05:25 GMT (38044kb,D)
[v3] Tue, 1 Nov 2022 16:37:06 GMT (38047kb,D)
Link back to: arXiv, form interface, contact.