Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention

Li, Yanxiong; Wang, Wucheng; Chen, Hao; Cao, Wenchang; Li, Wei; He, Qianhua

(Submitted on 24 Apr 2022)

Abstract: Although few-shot learning has attracted much attention from the fields of image and audio classification, few efforts have been made on few-shot speaker identification. In the task of few-shot learning, overfitting is a tough problem mainly due to the mismatch between training and testing conditions. In this paper, we propose a few-shot speaker identification method which can alleviate the overfitting problem. In the proposed method, the model of a depthwise separable convolutional network with channel attention is trained with a prototypical loss function. Experimental datasets are extracted from three public speech corpora: Aishell-2, VoxCeleb1 and TORGO. Experimental results show that the proposed method exceeds state-of-the-art methods for few-shot speaker identification in terms of accuracy and F-score.
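The abstract names a prototypical loss but does not spell out its form. As a minimal sketch, assuming the common prototypical-network formulation (class prototypes as mean support embeddings, squared Euclidean distance, softmax over negative distances), the loss for one episode might look like the following. The function names and the toy two-dimensional embeddings are hypothetical; in the paper the embeddings would come from the depthwise separable convolutional network with channel attention, which is not reproduced here.

```python
import math

def prototypes(support):
    """Average each speaker's support embeddings into one prototype vector.

    `support` maps a speaker label to a list of equal-length embeddings.
    """
    protos = {}
    for speaker, embeddings in support.items():
        dim = len(embeddings[0])
        protos[speaker] = [
            sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)
        ]
    return protos

def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototypical_loss(support, queries):
    """Mean negative log-probability of the true speaker over the queries.

    `queries` is a list of (embedding, speaker_label) pairs.
    """
    protos = prototypes(support)
    total = 0.0
    for embedding, label in queries:
        # Logit for each speaker is the negative distance to its prototype.
        logits = {s: -sq_dist(embedding, p) for s, p in protos.items()}
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(logits.values())
        log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_z - logits[label]
    return total / len(queries)

def predict(support, embedding):
    """Identify the speaker whose prototype is nearest to the embedding."""
    protos = prototypes(support)
    return min(protos, key=lambda s: sq_dist(embedding, protos[s]))

# Toy 2-way, 2-shot episode with hand-made embeddings (illustrative only).
support = {
    "spk_a": [[0.0, 0.0], [0.0, 2.0]],
    "spk_b": [[4.0, 0.0], [4.0, 2.0]],
}
loss = prototypical_loss(support, [([0.0, 1.0], "spk_a")])
guess = predict(support, [0.5, 1.0])
```

During training, a loss of this kind would be minimised over many randomly sampled episodes so that utterances of the same speaker cluster around a shared prototype; at test time only a few enrolment utterances per new speaker are needed to form the prototypes.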

Comments:	Accepted by Odyssey 2022 (The Speaker and Language Recognition Workshop 2022, Beijing, China)
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2204.11180 [eess.AS]
	(or arXiv:2204.11180v1 [eess.AS] for this version)

Submission history

From: Yanxiong Li
[v1] Sun, 24 Apr 2022 03:31:05 GMT (1040kb)
