Perturbation Analysis of Gradient-based Adversarial Attacks

Ozbulak, Utku; Gasparyan, Manvel; De Neve, Wesley; Van Messem, Arnout

doi:10.1016/j.patrec.2020.04.034

Computer Science > Machine Learning

Title: Perturbation Analysis of Gradient-based Adversarial Attacks

Authors: Utku Ozbulak, Manvel Gasparyan, Wesley De Neve, Arnout Van Messem

(Submitted on 2 Jun 2020)

Abstract: After the discovery of adversarial examples and their adverse effects on deep learning models, many studies focused on finding more diverse methods to generate these carefully crafted samples. Although empirical results on the effectiveness of adversarial example generation methods against defense mechanisms are discussed in detail in the literature, an in-depth study of the theoretical properties and the perturbation effectiveness of these adversarial attacks has largely been lacking. In this paper, we investigate the objective functions of three popular methods for adversarial example generation: the L-BFGS attack, the Iterative Fast Gradient Sign attack, and Carlini & Wagner's attack (CW). Specifically, we perform a comparative and formal analysis of the loss functions underlying the aforementioned attacks while laying out large-scale experimental results on the ImageNet dataset. This analysis exposes (1) the faster optimization speed as well as the constrained optimization space of the cross-entropy loss, (2) the detrimental effects of using the signature of the cross-entropy loss on optimization precision as well as optimization space, and (3) the slow optimization speed of the logit loss in the context of adversariality. Our experiments reveal that the Iterative Fast Gradient Sign attack, which is thought to be fast for generating adversarial examples, is the worst attack in terms of the number of iterations required to create adversarial examples in the setting of equal perturbation. Moreover, our experiments show that the underlying loss function of CW, which is criticized for being substantially slower than other adversarial attacks, is not that much slower than other loss functions. Finally, we analyze how well neural networks can identify adversarial perturbations generated by the attacks under consideration, thereby revisiting the idea of adversarial retraining on ImageNet.
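
The three loss functions compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear three-class model W, the starting point, the step size, and all function names are hypothetical, and every attack is written in a targeted form for simplicity. The sketch omits the box constraint and the L-BFGS optimizer of the first attack, as well as the change of variables, distance term, and confidence margin used by CW, so the iteration counts it prints say nothing about the paper's results.

```python
import math

# Hypothetical toy model: a linear 3-class classifier on 2-D inputs.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def logits(x):
    return [sum(w_k * x_k for w_k, x_k in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x):
    z = logits(x)
    return max(range(len(z)), key=lambda i: z[i])

def ce_grad(x, target):
    """Gradient w.r.t. x of the cross-entropy loss toward `target`."""
    p = softmax(logits(x))
    diff = [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]
    return [sum(diff[i] * W[i][k] for i in range(len(W))) for k in range(len(x))]

def sign_ce_grad(x, target):
    """Element-wise sign of the cross-entropy gradient (IFGS-style step)."""
    return [float((g > 0) - (g < 0)) for g in ce_grad(x, target)]

def logit_grad(x, target):
    """Gradient of the logit loss max_{i != target} z_i - z_target."""
    z = logits(x)
    j = max((i for i in range(len(z)) if i != target), key=lambda i: z[i])
    return [W[j][k] - W[target][k] for k in range(len(x))]

def attack(x0, target, direction, step=0.1, max_iter=1000):
    """Descend along `direction` until the model predicts `target`.

    Returns the final point and the number of steps taken,
    or None for the step count if the attack did not succeed.
    """
    x = list(x0)
    for it in range(max_iter):
        if predict(x) == target:
            return x, it
        d = direction(x, target)
        x = [x_k - step * d_k for x_k, d_k in zip(x, d)]
    return (x, max_iter) if predict(x) == target else (x, None)

x0 = [1.0, 0.2]   # classified as class 0 by the toy model
target = 1
results = {
    "cross-entropy": attack(x0, target, ce_grad),
    "sign of cross-entropy": attack(x0, target, sign_ce_grad),
    "logit loss": attack(x0, target, logit_grad),
}
for name, (x_adv, iters) in results.items():
    l2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_adv, x0)))
    print(f"{name}: steps={iters}, L2 perturbation={l2:.3f}")
```

The only difference between the three runs is the direction function, which mirrors the abstract's focus on the loss functions themselves rather than on the surrounding optimization machinery.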

Comments:	Accepted for publication in Pattern Recognition Letters, 2020
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Journal reference:	Pattern Recognition Letters 2020, Volume 135, Pages 313-320
DOI:	10.1016/j.patrec.2020.04.034
Cite as:	arXiv:2006.01456 [cs.LG]
	(or arXiv:2006.01456v1 [cs.LG] for this version)

Submission history

From: Utku Ozbulak [view email]
[v1] Tue, 2 Jun 2020 08:51:37 GMT (4195kb,D)
