Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries

Cann, Alexander; Colbert, Ian; Amer, Ihab

(Submitted on 14 Sep 2022)

Abstract: The widespread adoption of deep neural networks in computer vision applications has brought forth a significant interest in adversarial robustness. Existing research has shown that maliciously perturbed inputs specifically tailored for a given model (i.e., adversarial examples) can be successfully transferred to another independently trained model to induce prediction errors. Moreover, this property of adversarial examples has been attributed to features derived from predictive patterns in the data distribution. Thus, we are motivated to investigate the following question: Can adversarial defenses, like adversarial examples, be successfully transferred to other independently trained models? To this end, we propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE). After examining theoretical motivation and implications, we experimentally show that our method can provide adversarial robustness to multiple independently pre-trained classifiers that are otherwise ineffective against an adaptive white box adversary. Furthermore, we show that RTFEs can even provide one-shot adversarial robustness to models independently trained on different datasets.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Cite as:	arXiv:2209.06931 [cs.LG]
	(or arXiv:2209.06931v1 [cs.LG] for this version)

Submission history

From: Ian Colbert [view email]
[v1] Wed, 14 Sep 2022 21:09:34 GMT (625kb,D)
