Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification

Wang, Qing; Yao, Jixun; Wang, Ziqian; Guo, Pengcheng; Xie, Lei

(Submitted on 30 May 2023)

Abstract: In this study, we propose a timbre-reserved adversarial attack approach for speaker identification (SID) that not only exploits the weakness of the SID model but also preserves the timbre of the target speaker in a black-box attack setting. Specifically, we generate timbre-reserved fake audio by adding an adversarial constraint during the training of the voice conversion model. We then use a pseudo-Siamese network architecture to train a substitute SID model that learns from the black-box SID model, constraining intrinsic similarity and structural similarity simultaneously. The intrinsic similarity loss learns an intrinsic invariance, while the structural similarity loss ensures that the substitute SID model shares a similar decision boundary with the fixed black-box SID model. The substitute model can then be used as a proxy to generate timbre-reserved fake audio for attacking. Experimental results on the Audio Deepfake Detection (ADD) challenge dataset indicate that our proposed approach achieves attack success rates of up to 60.58% and 55.38% in the white-box and black-box scenarios, respectively, and can deceive both human beings and machines.
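The abstract reports its results as an attack success rate but does not define the metric. The following is a minimal sketch of one common way such a rate is computed, assuming an attack counts as successful when the SID model's predicted label for a fake utterance equals the label the attacker intended. The function name `attack_success_rate` and the integer label encoding are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def attack_success_rate(predicted: Sequence[int], intended: Sequence[int]) -> float:
    """Return the percentage of fake utterances whose predicted label
    matches the attacker's intended label.

    Assumption (not from the paper): success means an exact label match.
    """
    if len(predicted) != len(intended):
        raise ValueError("predicted and intended must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one utterance is required")
    # Count utterances where the SID prediction equals the intended label.
    hits = sum(1 for p, t in zip(predicted, intended) if p == t)
    return 100.0 * hits / len(predicted)


# Hypothetical labels for four fake utterances: three match the intended label.
rate = attack_success_rate([1, 2, 3, 3], [1, 2, 0, 3])
print(f"{rate:.2f}%")
```

Under this assumed definition, the white-box and black-box figures would differ only in which model (the known SID model or the substitute proxy) guided the generation of the fake audio. The scoring itself would be the same.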

Comments:	5 pages
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2305.19020 [cs.SD]
	(or arXiv:2305.19020v1 [cs.SD] for this version)

Submission history

From: Qing Wang
[v1] Tue, 30 May 2023 13:20:31 GMT (1089kb,D)
