Current browse context:
cs.CV
Change to browse by:
References & Citations
Computer Science > Computer Vision and Pattern Recognition
Title: Feature Losses for Adversarial Robustness
(Submitted on 10 Dec 2019)
Abstract: Deep learning has made tremendous advances in computer vision tasks such as image classification. However, recent studies have shown that deep learning models are vulnerable to specifically crafted adversarial inputs that are quasi-imperceptible to humans. In this work, we propose a novel approach to defending adversarial attacks. We employ an input processing technique based on denoising autoencoders as a defense. It has been shown that the input perturbations grow and accumulate as noise in feature maps while propagating through a convolutional neural network (CNN). We exploit the noisy feature maps by using an additional subnetwork to extract image feature maps and train an auto-encoder on perceptual losses of these feature maps. This technique achieves close to state-of-the-art results on defending MNIST and CIFAR10 datasets, but more importantly, shows a new way of employing a defense that cannot be trivially trained end-to-end by the attacker. Empirical results demonstrate the effectiveness of this approach on the MNIST and CIFAR10 datasets on simple as well as iterative LP attacks. Our method can be applied as a preprocessing technique to any off the shelf CNN.
Submission history
From: Kirthi Shankar Sivamani [view email][v1] Tue, 10 Dec 2019 04:58:45 GMT (844kb,D)
Link back to: arXiv, form interface, contact.