Reconstructing Training Data with Informed Adversaries

Balle, Borja; Cherubin, Giovanni; Hayes, Jamie

(Submitted on 13 Jan 2022 (v1), last revised 25 Apr 2022 (this version, v2))

Abstract: Given access to a machine learning model, can an adversary reconstruct the model's training data? This work studies the question through the lens of a powerful informed adversary who knows all the training data points except one. By instantiating concrete attacks, we show it is feasible to reconstruct the remaining data point in this stringent threat model. For convex models (e.g., logistic regression), reconstruction attacks are simple and can be derived in closed form. For more general models (e.g., neural networks), we propose an attack strategy based on training a reconstructor network that receives as input the weights of the model under attack and produces as output the target data point. We demonstrate the effectiveness of our attack on image classifiers trained on MNIST and CIFAR-10, and systematically investigate which factors of standard machine learning pipelines affect reconstruction success. Finally, we theoretically investigate what amount of differential privacy suffices to mitigate reconstruction attacks by informed adversaries. Our work provides an effective reconstruction attack that model developers can use to assess memorization of individual points in general settings beyond those considered in previous works (e.g., generative language models or access to training gradients); it shows that standard models have the capacity to store enough information to enable high-fidelity reconstruction of training data points; and it demonstrates that differential privacy can successfully mitigate such attacks in a parameter regime where utility degradation is minimal.
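
To make the closed-form claim for convex models concrete, here is a minimal sketch of how an informed adversary could recover the one unknown training point from a model trained to its exact optimum. It is an illustration under stated assumptions, not the authors' implementation. Ridge regression with an unregularized intercept is swapped in for logistic regression because its optimum can be computed exactly; the same stationarity argument carries over to logistic regression with an intercept. The helper names (fit_ridge, reconstruct), the synthetic data, and the regularization strength lam are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Exact minimizer of 0.5*sum((X w + b - y)^2) + 0.5*lam*||w||^2."""
    n, d = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])   # append an intercept column
    R = lam * np.eye(d + 1)
    R[d, d] = 0.0                          # the intercept is not regularized
    theta = np.linalg.solve(Xa.T @ Xa + R, Xa.T @ y)
    return theta[:d], theta[d]

def reconstruct(w, b, X_known, y_known, lam):
    """Recover the missing training point from the stationarity condition.

    At the optimum the per-example gradients plus the regularizer sum to
    zero, so the unknown point's gradient is minus everything the
    adversary can compute from the known points and the released weights.
    """
    r = X_known @ w + b - y_known          # residuals of the known points
    g_w = -(X_known.T @ r + lam * w)       # equals r_t * x_t
    g_b = -r.sum()                         # equals r_t, the target's residual
    x_t = g_w / g_b                        # breaks down if r_t is (near) zero
    y_t = x_t @ w + b - g_b
    return x_t, y_t

# Synthetic data (assumption): 50 points in 5 dimensions; row 0 is the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = rng.normal(size=50)
lam = 0.1

w, b = fit_ridge(X, y, lam)
x_rec, y_rec = reconstruct(w, b, X[1:], y[1:], lam)
recon_error = max(np.abs(x_rec - X[0]).max(), abs(y_rec - y[0]))
print("max reconstruction error:", recon_error)
```

The sketch relies on the model being trained exactly to the optimum and on the target point having a nonzero residual. If the released weights come from approximate or noisy training, the recovered point degrades accordingly.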

Comments:	Published at "2022 IEEE Symposium on Security and Privacy (SP)"
Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2201.04845 [cs.CR]
	(or arXiv:2201.04845v2 [cs.CR] for this version)

Submission history

From: Borja Balle
[v1] Thu, 13 Jan 2022 09:19:25 GMT (1932kb,D)
[v2] Mon, 25 Apr 2022 12:53:14 GMT (1934kb,D)
