Privacy for Free: How does Dataset Condensation Help Privacy?

Dong, Tian; Zhao, Bo; Lyu, Lingjuan



(Submitted on 1 Jun 2022)

Abstract: To prevent unintentional data leakage, the research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of data privacy, existing solutions suffer from either expensive training cost or poor generalization performance. Therefore, we raise the question of whether training efficiency and privacy can be achieved simultaneously. In this work, we identify for the first time that dataset condensation (DC), which was originally designed to improve training efficiency, is also a better solution than traditional data generators for private data generation, thus providing privacy for free. To demonstrate the privacy benefit of DC, we build a connection between DC and differential privacy, and theoretically prove on linear feature extractors (with an extension to non-linear feature extractors) that the existence of one sample has limited impact ($O(m/n)$) on the parameter distribution of networks trained on $m$ samples synthesized from $n$ ($n \gg m$) raw samples by DC. We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both loss-based and state-of-the-art likelihood-based membership inference attacks. We envision this work as a milestone for data-efficient and privacy-preserving machine learning.
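
For readers unfamiliar with the loss-based membership inference attack mentioned in the abstract, the general idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `loss_threshold_attack` and `attack_advantage` are hypothetical, the per-sample losses are synthetic draws rather than losses of a real model, and the threshold rule (mean member loss) is one common heuristic chosen here for simplicity.

```python
import numpy as np


def loss_threshold_attack(losses, threshold):
    """Guess 'member' (True) for every sample whose loss is below the threshold.

    Intuition: a model usually fits its training samples better than unseen
    samples, so a low loss is weak evidence of membership.
    """
    return np.asarray(losses) < threshold


def attack_advantage(member_losses, nonmember_losses, threshold):
    """Return TPR - FPR of the threshold attack (0 means no better than chance)."""
    tpr = loss_threshold_attack(member_losses, threshold).mean()
    fpr = loss_threshold_attack(nonmember_losses, threshold).mean()
    return float(tpr - fpr)


# Synthetic stand-ins for per-sample losses (assumption: members have lower loss).
rng = np.random.default_rng(0)
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)

# Heuristic threshold: the average loss observed on known members.
threshold = member_losses.mean()
advantage = attack_advantage(member_losses, nonmember_losses, threshold)
print(f"attack advantage (TPR - FPR): {advantage:.3f}")
```

An advantage close to zero would indicate that the attacker cannot distinguish members from non-members by loss alone, which is the kind of outcome a privacy-preserving training pipeline aims for.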

Comments:	Accepted by ICML 2022 as Oral
Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2206.00240 [cs.CR]
	(or arXiv:2206.00240v1 [cs.CR] for this version)

Submission history

From: Tian Dong
[v1] Wed, 1 Jun 2022 05:39:57 GMT (3835kb,D)
