The Devil's Advocate: Shattering the Illusion of Unexploitable Data using Diffusion Models

Dolatabadi, Hadi M.; Erfani, Sarah; Leckie, Christopher

(Submitted on 15 Mar 2023 (v1), last revised 11 Jan 2024 (this version, v2))

Abstract: Protecting personal data against exploitation by machine learning models is crucial. Recently, availability attacks have shown great promise to provide an extra layer of protection against the unauthorized use of data to train neural networks. These methods aim to add imperceptible noise to clean data so that the neural networks cannot extract meaningful patterns from the protected data, claiming that they can make personal data "unexploitable." This paper provides a strong countermeasure against such approaches, showing that unexploitable data might only be an illusion. In particular, we leverage the power of diffusion models and show that a carefully designed denoising process can counteract the effectiveness of the data-protecting perturbations. We rigorously analyze our algorithm, and theoretically prove that the amount of required denoising is directly related to the magnitude of the data-protecting perturbations. Our approach, called AVATAR, delivers state-of-the-art performance against a suite of recent availability attacks in various scenarios, outperforming adversarial training even under distribution mismatch between the diffusion model and the protected data. Our findings call for more research into making personal data unexploitable, showing that this goal is far from achieved. Our implementation is available at this repository: this https URL
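
To make the link between perturbation size and denoising amount concrete, here is a minimal sketch. It is not the authors' implementation and does not reproduce their theorem. It assumes the standard DDPM forward process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, with a linear noise schedule, and it treats the perturbation as a per-pixel offset of size epsilon on the same scale as the model input. The function names, the schedule parameters, the epsilon values, and the target ratio are all illustrative assumptions.

```python
import math

def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def perturbation_to_noise_ratio(epsilon, t, alpha_bars):
    """Scaled perturbation size divided by the injected noise std at step t."""
    a = alpha_bars[t - 1]  # steps are 1-indexed
    return math.sqrt(a) * epsilon / math.sqrt(1.0 - a)

def smallest_step_below(epsilon, target_ratio, alpha_bars):
    """First step at which the injected noise dominates the perturbation."""
    for t in range(1, len(alpha_bars) + 1):
        if perturbation_to_noise_ratio(epsilon, t, alpha_bars) <= target_ratio:
            return t
    return None  # the schedule never reaches the target ratio

ALPHA_BARS = cumulative_alphas(linear_betas())
# Illustrative budgets only; the target ratio 0.25 is an arbitrary choice.
small_budget_step = smallest_step_below(8 / 255, 0.25, ALPHA_BARS)
large_budget_step = smallest_step_below(16 / 255, 0.25, ALPHA_BARS)
print(small_budget_step, large_budget_step)
```

Under these assumptions, the larger perturbation budget needs more forward diffusion steps before the injected Gaussian noise drowns it out, which matches the qualitative claim in the abstract. A reverse denoising pass with a trained diffusion model, omitted here, would then map the noised sample back toward the clean data distribution.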

Comments:	Accepted to the 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2303.08500 [cs.LG]
	(or arXiv:2303.08500v2 [cs.LG] for this version)

Submission history

From: Hadi Mohaghegh Dolatabadi
[v1] Wed, 15 Mar 2023 10:20:49 GMT (28915kb,D)
[v2] Thu, 11 Jan 2024 03:12:36 GMT (39989kb,D)
