Siamese x-vector reconstruction for domain adapted speaker recognition

Rozenberg, Shai; Aronowitz, Hagai; Hoory, Ron

(Submitted on 28 Jul 2020)

Abstract: With the rise of voice-activated applications, the need for speaker recognition is rapidly increasing. The x-vector, an embedding approach based on a deep neural network (DNN), is considered the state-of-the-art when proper end-to-end training is not feasible. However, the accuracy significantly decreases when recording conditions (noise, sample rate, etc.) are mismatched, either between the x-vector training data and the target data or between enrollment and test data. We introduce the Siamese x-vector Reconstruction (SVR) for domain adaptation. We reconstruct the embedding of a higher quality signal from a lower quality counterpart using a lean auxiliary Siamese DNN. We evaluate our method on several mismatch scenarios and demonstrate significant improvement over the baseline.

Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2007.14146 [eess.AS]
	(or arXiv:2007.14146v1 [eess.AS] for this version)

Submission history

From: Shai Rozenberg
[v1] Tue, 28 Jul 2020 12:01:03 GMT (827kb)
