Electrical Engineering and Systems Science > Audio and Speech Processing
Title: A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
(Submitted on 8 Oct 2021 (v1), last revised 11 Oct 2021 (this version, v2))
Abstract: In this paper, we conduct a cross-dataset study of parametric and non-parametric raw-waveform-based speaker embeddings through speaker verification experiments. In general, we observe a more significant performance degradation in these raw-waveform systems than in spectral-based systems. We then propose two strategies to improve the performance of raw-waveform-based systems on cross-dataset tests. The first strategy is to change the real-valued filters into analytic filters to ensure shift-invariance. The second strategy is to apply variational dropout to non-parametric filters to prevent them from overfitting to irrelevant nuance features.
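To make the first strategy concrete, the following is a minimal sketch of how a real-valued filter can be turned into an analytic one, and why the magnitude of its output changes less under small time shifts. It is not the authors' implementation: the Gabor-style filter, its parameters, and the helper names (`analytic_filter`, `gabor`, `relative_spread`) are assumptions chosen for illustration, and the analytic filter is built with the standard FFT construction (zero the negative-frequency bins, double the positive ones).

```python
import numpy as np


def analytic_filter(h):
    """Return the complex analytic counterpart of a real FIR filter h.

    Standard FFT construction: keep DC (and Nyquist, if present) as is,
    double the positive-frequency bins, zero the negative-frequency bins.
    The real part of the result equals h.
    """
    n = len(h)
    spectrum = np.fft.fft(h)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def gabor(n_taps=129, freq=0.1, sigma=16.0):
    """Hypothetical real band-pass filter: Gaussian window times a cosine."""
    t = np.arange(n_taps) - n_taps // 2
    return np.exp(-0.5 * (t / sigma) ** 2) * np.cos(2 * np.pi * freq * t)


def relative_spread(y):
    """Standard deviation divided by mean: how much y varies over time."""
    return y.std() / y.mean()


h = gabor()
h_analytic = analytic_filter(h)

# Test signal: a sinusoid at the filter's centre frequency (cycles/sample).
x = np.sin(2 * np.pi * 0.1 * np.arange(1024))

# Rectified output of the real filter oscillates with the carrier phase,
# so its value depends strongly on where the analysis frame starts.
y_real = np.abs(np.convolve(x, h, mode="valid"))

# Magnitude of the analytic filter output follows the envelope instead,
# so it stays nearly constant as the input is shifted in time.
y_env = np.abs(np.convolve(x, h_analytic, mode="valid"))

print("real filter spread:    ", relative_spread(y_real))
print("analytic filter spread:", relative_spread(y_env))
```

In this toy setting the second number should come out much smaller than the first, which is the sense in which analytic filters are less sensitive to shifts; how the paper integrates such filters into a trained front end is not shown here.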
Submission history
From: Ge Zhu
[v1] Fri, 8 Oct 2021 17:21:21 GMT (1038kb,D)
[v2] Mon, 11 Oct 2021 20:50:09 GMT (1034kb,D)