Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Hein, Matthias; Andriushchenko, Maksym



(Submitted on 23 May 2017 (v1), last revised 5 Nov 2017 (this version, v2))

Abstract: Recent work has shown that state-of-the-art classifiers are quite brittle: a small adversarial change to an input that was originally classified correctly with high confidence leads to a wrong classification, again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their use in safety-critical systems. In this paper we give, for the first time, formal guarantees on the robustness of a classifier in the form of instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods and in neural networks improves the robustness of the classifier without any loss in prediction performance.
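
To make the idea of an instance-specific lower bound concrete, here is a minimal sketch for the special case of a linear multi-class classifier with scores f_j(x) = <w_j, x> + b_j. This setting is an assumption chosen for illustration; the abstract does not state the bound itself, and the paper's guarantee covers more general classifiers. In the linear case the smallest l2-norm change that can flip the decision from the predicted class c to another class j is (f_c(x) - f_j(x)) / ||w_c - w_j||_2, the distance from x to the corresponding decision hyperplane. The function names and the example weights are hypothetical.

```python
import math


def linear_scores(W, b, x):
    """Class scores f_j(x) = <w_j, x> + b_j (linear model, assumed for illustration)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + bj for w, bj in zip(W, b)]


def l2_robustness_bound(W, b, x):
    """Return (predicted class, minimal l2 perturbation norm that can change it).

    For a linear classifier this is the distance from x to the nearest
    decision hyperplane between the predicted class c and any other class j.
    """
    scores = linear_scores(W, b, x)
    c = max(range(len(scores)), key=scores.__getitem__)
    bound = math.inf
    for j in range(len(scores)):
        if j == c:
            continue
        gap = scores[c] - scores[j]  # non-negative because c is the argmax
        grad_gap = math.sqrt(sum((wc - wj) ** 2 for wc, wj in zip(W[c], W[j])))
        if grad_gap == 0.0:
            # Identical weight vectors: the score gap is constant in x, so
            # class j can only win if the two classes are already tied.
            if gap == 0.0:
                bound = 0.0
            continue
        bound = min(bound, gap / grad_gap)
    return c, bound


# Hypothetical two-class example: the decision boundary is the line x1 = x2.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
predicted, radius = l2_robustness_bound(W, b, [3.0, 1.0])
print(predicted, radius)
```

Any perturbation with l2-norm strictly smaller than the returned radius leaves the predicted class of this linear model unchanged. The ratio also shows why keeping the differences between class gradients small relative to the score gaps makes a classifier harder to fool.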

Comments:	Final version accepted at NIPS 2017; fixed a bug in the implementation of Cross-Lipschitz regularization and the lower bound computation, which improved the results
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1705.08475 [cs.LG]
	(or arXiv:1705.08475v2 [cs.LG] for this version)

Submission history

From: Matthias Hein
[v1] Tue, 23 May 2017 18:48:20 GMT (659kb,D)
[v2] Sun, 5 Nov 2017 20:58:09 GMT (983kb,D)
