Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

Chen, Guangke; Zhao, Zhe; Song, Fu; Chen, Sen; Fan, Lingling; Wang, Feng; Wang, Jiashui

(Submitted on 7 Jun 2022)

Abstract: Speaker recognition systems (SRSs) have recently been shown to be vulnerable to adversarial attacks, raising significant security concerns. In this work, we systematically investigate transformation-based and adversarial-training-based defenses for securing SRSs. Based on the characteristics of SRSs, we present 22 diverse transformations and thoroughly evaluate them using 7 recent promising adversarial attacks (4 white-box and 3 black-box) on speaker recognition. With careful regard for best practices in defense evaluations, we analyze the strength of the transformations in withstanding adaptive attacks. We also evaluate and analyze their effectiveness against adaptive attacks when combined with adversarial training. Our study provides many useful insights and findings, many of which are new or inconsistent with the conclusions in the image and speech recognition domains; for example, variable and constant bit rate speech compressions perform differently, and some non-differentiable transformations remain effective against current promising evasion techniques, which often work well in the image domain. We demonstrate that the proposed novel feature-level transformation combined with adversarial training is more effective than adversarial training alone in a complete white-box setting, e.g., increasing the accuracy by 13.62% and the attack cost by two orders of magnitude, while other transformations do not necessarily improve the overall defense capability. This work sheds further light on the research directions in this field. We also release our evaluation platform SPEAKERGUARD to foster further research.

Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2206.03393 [cs.SD]
	(or arXiv:2206.03393v1 [cs.SD] for this version)

Submission history

From: Fu Song
[v1] Tue, 7 Jun 2022 15:38:27 GMT (6982kb,D)
