Visualizing data augmentation in deep speaker recognition

Li, Pengqi; Li, Lantian; Hamdulla, Askar; Wang, Dong

(Submitted on 25 May 2023)

Abstract: Visualization is of great value in understanding the internal mechanisms of neural networks. Previous work found that LayerCAM is a reliable visualization tool for deep speaker models. In this paper, we use LayerCAM to analyze the widely adopted data augmentation (DA) approach and to understand how it leads to model robustness. We conduct speaker identification experiments on the VoxCeleb1 dataset, which show that both vanilla and activation-based (Act) DA approaches enhance robustness against interference, with Act DA being consistently superior. Visualization with LayerCAM suggests that DA helps models learn to delete temporal-frequency (TF) bins that are corrupted by interference. This 'learn to delete' behavior explains why DA models are more robust than clean models, and why Act DA is superior to vanilla DA when the interference is nontarget speech. However, LayerCAM still cannot clearly explain the superiority of Act DA in other situations, suggesting that further research is needed.
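
For readers unfamiliar with LayerCAM, the following is a minimal sketch of how such a saliency map over TF bins can be computed from one layer's activations and gradients. It is not the authors' implementation: the function name `layercam`, the (channels, freq, time) array layout, the peak normalization, and the toy values are all assumptions made for illustration. In practice the activations and gradients would come from a trained speaker model rather than hand-written arrays.

```python
import numpy as np

def layercam(activations, gradients):
    """Sketch of a LayerCAM-style saliency map.

    activations, gradients: arrays of shape (channels, freq, time), where
    gradients holds the derivative of the target speaker score with respect
    to each activation (assumed layout, for illustration only).
    """
    # Use the positive part of each gradient as an element-wise weight.
    weights = np.maximum(gradients, 0.0)
    # Weight the activations, sum over channels, and keep positive evidence.
    cam = np.maximum((weights * activations).sum(axis=0), 0.0)
    # Scale to [0, 1] for display (an assumed convention, not from the paper).
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example: 2 channels, 2 frequency bins, 2 time frames (hypothetical values).
activations = np.array([[[1.0, 2.0], [0.0, 1.0]],
                        [[2.0, 0.0], [1.0, 1.0]]])
gradients = np.array([[[0.5, -1.0], [1.0, 0.0]],
                      [[-0.5, 2.0], [1.0, 1.0]]])
saliency = layercam(activations, gradients)
```

Under this sketch, TF bins with values near zero in `saliency` are the ones the model effectively ignores, which is the kind of pattern the abstract describes as 'learn to delete'.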

Comments:	to be published in INTERSPEECH 2023
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2305.16070 [cs.SD]
	(or arXiv:2305.16070v1 [cs.SD] for this version)

Submission history

From: Lantian Li Mr.
[v1] Thu, 25 May 2023 14:01:07 GMT (10775kb,D)
