Salient Conditional Diffusion for Defending Against Backdoor Attacks

May, Brandon B.; Tatro, N. Joseph; Walker, Dylan; Kumar, Piyush; Shnidman, Nathan

(Submitted on 31 Jan 2023 (v1), last revised 19 May 2023 (this version, v2))

Abstract: We propose Salient Conditional Diffusion (Sancdifi), a novel, state-of-the-art defense against backdoor attacks. Sancdifi uses a denoising diffusion probabilistic model (DDPM) to degrade an image with noise and then recover the image using the learned reverse diffusion. Critically, we compute saliency map-based masks to condition our diffusion, allowing the DDPM to apply stronger diffusion to the most salient pixels. As a result, Sancdifi is highly effective at diffusing out triggers in data poisoned by backdoor attacks. At the same time, it reliably recovers salient features when applied to clean data. This performance is achieved without access to the model parameters of the Trojan network, so Sancdifi operates as a black-box defense.
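The noising half of the idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mask enters the diffusion, so the blending rule below (two noise levels mixed per pixel), the linear beta schedule, and all names (`linear_alpha_bar`, `forward_diffuse`, `masked_forward_diffuse`, `t_salient`, `t_background`) are assumptions made for illustration. Only the closed-form forward step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, is standard DDPM. The learned reverse diffusion needs a trained model and is omitted.

```python
import numpy as np

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Standard closed-form DDPM forward step: noise x0 directly to timestep t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def masked_forward_diffuse(x0, mask, t_salient, t_background, alpha_bar, rng):
    """Hypothetical mask conditioning: salient pixels (mask == 1) are noised to a
    later timestep than the rest. The blending rule is an assumption of this sketch."""
    strong = forward_diffuse(x0, t_salient, alpha_bar, rng)
    weak = forward_diffuse(x0, t_background, alpha_bar, rng)
    return mask * strong + (1.0 - mask) * weak

# Usage: a flat grey image whose top-left quadrant is marked salient.
rng = np.random.default_rng(0)
alpha_bar = linear_alpha_bar(1000)
image = np.full((32, 32), 0.5)
saliency_mask = np.zeros((32, 32))
saliency_mask[:16, :16] = 1.0
noisy = masked_forward_diffuse(image, saliency_mask, 600, 50, alpha_bar, rng)
```

In this sketch the salient quadrant ends up far noisier than the background, which is the qualitative behaviour the abstract describes; a real saliency mask would come from a saliency method rather than a fixed quadrant.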

Comments:	14 pages, 5 figures. Edit: Added new baselines
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
ACM classes:	I.2
Cite as:	arXiv:2301.13862 [cs.LG]
	(or arXiv:2301.13862v2 [cs.LG] for this version)

Submission history

From: Norman Tatro
[v1] Tue, 31 Jan 2023 18:56:41 GMT (12996kb,D)
[v2] Fri, 19 May 2023 05:36:30 GMT (13003kb,D)
