Adversarial Purification through Representation Disentanglement

Bai, Tao; Zhao, Jun; Guo, Lanqing; Wen, Bihan

Computer Science > Computer Vision and Pattern Recognition

(Submitted on 15 Oct 2021)

Abstract: Deep learning models are vulnerable to adversarial examples and make incomprehensible mistakes, which threatens their real-world deployment. Combined with the idea of adversarial training, preprocessing-based defenses are popular and convenient to use because of their task independence and good generalizability. Current defense methods, especially purification, tend to remove "noise" by learning and recovering the natural images. However, unlike random noise, adversarial patterns are much more easily overfitted during model training because they are strongly correlated with the images. In this work, we propose a novel adversarial purification scheme that disentangles natural images from adversarial perturbations as a preprocessing defense. Extensive experiments show that our defense generalizes and provides significant protection against unseen strong adversarial attacks. It reduces the success rates of state-of-the-art ensemble attacks from 61.7% to 14.9% on average, outperforming a number of existing methods. Notably, our defense restores the perturbed images perfectly and does not hurt the clean accuracy of backbone models, which is highly desirable in practice.
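
To put the headline figures in perspective, the short sketch below works out the absolute and relative drop in attack success rate from the two averages quoted in the abstract. The function name `success_rate_reduction` is a hypothetical helper introduced only for this illustration. It is not part of the paper or of any library, and the calculation assumes both percentages are averages over the same set of attacks.

```python
def success_rate_reduction(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    absolute = before_pct - after_pct          # percentage points
    relative = 100.0 * absolute / before_pct   # share of the original rate
    return absolute, relative


# Average ensemble-attack success rates quoted in the abstract.
absolute, relative = success_rate_reduction(61.7, 14.9)
print(f"absolute drop: {absolute:.1f} percentage points")
print(f"relative drop: {relative:.1f}%")
```

Under these assumptions the drop is about 46.8 percentage points, or roughly three quarters of the undefended attack success rate.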

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2110.07801 [cs.CV]
	(or arXiv:2110.07801v1 [cs.CV] for this version)

Submission history

From: Tao Bai
[v1] Fri, 15 Oct 2021 01:45:31 GMT (7658kb,D)
