Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition

Chaubey, Ashutosh; Sinha, Sparsh; Ghose, Susmita

(Submitted on 1 Jun 2023 (v1), last revised 30 Sep 2023 (this version, v2))

Abstract: Speaker identification systems are deployed in diverse environments, often different from the lab conditions under which they are trained and tested. In this paper, we first show that fixed thresholds (computed using the EER metric) generalize poorly for imposter identification in unseen speaker recognition, and we introduce a robust speaker-specific thresholding technique that performs better. Second, inspired by the recent use of meta-learning techniques in speaker verification, we propose an end-to-end meta-learning framework for imposter detection that decouples imposter detection from unseen speaker identification. Thus, unlike most prior works, which rely on heuristics to detect imposters, the proposed network learns to detect imposters by leveraging the utterances of the enrolled speakers. Finally, we demonstrate the efficacy of the proposed techniques on the VoxCeleb1, VCTK, and FFSVC 2022 datasets, beating the baselines by up to 10%.
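To make the fixed-threshold baseline mentioned in the abstract concrete, the following is a minimal sketch of how a single EER-based threshold might be computed from labelled similarity scores and then used to reject imposters at identification time. It is not the authors' implementation: the function names `eer_threshold` and `identify`, the toy scores, and the assumption that higher scores mean greater similarity are all hypothetical and chosen only for illustration.

```python
import numpy as np

def eer_threshold(genuine_scores, imposter_scores):
    """Return (threshold, eer) at the point where the false-accept rate
    and false-reject rate are closest. Assumes higher score = more similar."""
    genuine = np.asarray(genuine_scores, dtype=float)
    imposter = np.asarray(imposter_scores, dtype=float)
    # Every observed score is a candidate threshold.
    candidates = np.sort(np.concatenate([genuine, imposter]))
    best_gap, best_thr, best_eer = np.inf, None, None
    for thr in candidates:
        far = np.mean(imposter >= thr)  # imposter trials wrongly accepted
        frr = np.mean(genuine < thr)    # genuine trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_thr, best_eer = gap, thr, (far + frr) / 2
    return float(best_thr), float(best_eer)

def identify(test_scores, threshold):
    """test_scores maps enrolled speaker id -> similarity to the test utterance.
    Returns the best-matching speaker, or None if the match is below the
    threshold (i.e. the utterance is treated as coming from an imposter)."""
    best = max(test_scores, key=test_scores.get)
    return best if test_scores[best] >= threshold else None

# Toy, made-up scores purely for demonstration.
genuine = [0.9, 0.8, 0.7, 0.6]
imposter = [0.1, 0.2, 0.3, 0.65]
thr, eer = eer_threshold(genuine, imposter)
accepted = identify({"spk_a": 0.70, "spk_b": 0.40}, thr)
rejected = identify({"spk_a": 0.50, "spk_b": 0.40}, thr)
```

Because one global threshold is tuned on a development set, it can transfer poorly when the deployment conditions or the enrolled speakers differ; a speaker-specific variant would, under the same assumptions, compute a separate threshold per enrolled speaker rather than sharing one value.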

Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2306.00952 [eess.AS]
	(or arXiv:2306.00952v2 [eess.AS] for this version)

Submission history

From: Ashutosh Chaubey
[v1] Thu, 1 Jun 2023 17:49:58 GMT (496kb,D)
[v2] Sat, 30 Sep 2023 19:35:49 GMT (887kb,D)
