STRATA: Simple, Gradient-Free Attacks for Models of Code

Springer, Jacob M.; Reinstadler, Bryn Marie; O'Reilly, Una-May

(Submitted on 28 Sep 2020 (v1), last revised 19 Aug 2021 (this version, v2))

Abstract: Neural networks are well known to be vulnerable to imperceptible perturbations in the input, called adversarial examples, that result in misclassification. Generating adversarial examples for source code poses an additional challenge compared to the domains of images and natural language, because source code perturbations must retain the functional meaning of the code. We identify a striking relationship between token frequency statistics and learned token embeddings: the L2 norm of learned token embeddings increases with the frequency of the token, except for the highest-frequency tokens. We leverage this relationship to construct a simple and efficient gradient-free method for generating state-of-the-art adversarial examples on models of code. Our method empirically outperforms competing gradient-based methods while using less information and less computational effort.
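
The relationship described in the abstract suggests that tokens can be ranked by the L2 norm of their learned embeddings without computing any gradients. The following is a minimal sketch of that ranking step only, not the authors' implementation: the function names `l2_norms` and `top_k_by_norm`, the dictionary layout, and the toy embedding values are all assumptions made up for illustration, and the abstract does not specify how the ranked tokens are then used to build an attack.

```python
import math

# Hypothetical toy embeddings (token -> vector). These values are invented
# for illustration only; they are not taken from any trained model.
TOY_EMBEDDINGS = {
    "i": [3.0, 4.0],
    "index": [1.0, 2.0],
    "value": [2.0, 2.0],
    "tmpBufferXyz": [0.1, 0.2],
}


def l2_norms(embeddings):
    """Return a dict mapping each token to the L2 norm of its embedding."""
    return {
        token: math.sqrt(sum(x * x for x in vector))
        for token, vector in embeddings.items()
    }


def top_k_by_norm(embeddings, k):
    """Return the k tokens with the largest embedding L2 norm, largest first.

    No gradients are needed: only the embedding table is read.
    """
    norms = l2_norms(embeddings)
    return sorted(norms, key=norms.get, reverse=True)[:k]


if __name__ == "__main__":
    for token in top_k_by_norm(TOY_EMBEDDINGS, 2):
        print(token, round(l2_norms(TOY_EMBEDDINGS)[token], 3))
```

If the embedding table of a target model is not available, the stated correlation between norm and frequency suggests that token counts from a training corpus could serve as a stand-in ranking signal, with the caveat from the abstract that the relationship does not hold for the highest-frequency tokens.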

Comments:	KDD'21 AdvML Workshop
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Cite as:	arXiv:2009.13562 [cs.LG]
	(or arXiv:2009.13562v2 [cs.LG] for this version)

Submission history

From: Jacob Springer
[v1] Mon, 28 Sep 2020 18:21:19 GMT (1120kb,D)
[v2] Thu, 19 Aug 2021 20:20:34 GMT (1976kb,D)
