One Shot Learning for Speech Separation

Wu, Yuan-Kuei; Huang, Kuan-Po; Tsao, Yu; Lee, Hung-yi

(Submitted on 20 Nov 2020 (v1), last revised 3 May 2021 (this version, v2))

Abstract: Despite the recent success of speech separation models, they fail to separate sources properly when facing different sets of speakers or noisy environments. To tackle this problem, we propose applying meta-learning to the speech separation task. We aim to find a meta-initialization model that can quickly adapt to new speakers after seeing only one mixture generated by those speakers. In this paper, we use the model-agnostic meta-learning (MAML) algorithm and the almost no inner loop (ANIL) algorithm in Conv-TasNet to achieve this goal. The experimental results show that our model can adapt not only to a new set of speakers but also to noisy environments. Furthermore, we find that the encoder and decoder serve as the feature-reuse layers, while the separator is the task-specific module.
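
As a rough illustration of how the two algorithms named in the abstract differ, the following is a minimal sketch of a one-shot inner-loop adaptation step on a toy two-parameter model. It is not the authors' implementation and does not involve Conv-TasNet: the names `loss_and_grads`, `inner_adapt`, `reused`, and `task_specific`, the toy model itself, and the learning rate are all assumptions made for illustration. In the MAML-style call every parameter is adapted to the new task, whereas in the ANIL-style call only the parameter designated as task-specific is adapted. The outer meta-update that learns the initialization is omitted.

```python
def loss_and_grads(params, x, y):
    """Squared error of the toy model y_hat = task_specific * reused * x.

    Returns the loss and its analytic gradient for each parameter.
    The model is a hypothetical stand-in, not a speech separation network.
    """
    err = params["task_specific"] * params["reused"] * x - y
    grads = {
        "reused": 2.0 * err * params["task_specific"] * x,
        "task_specific": 2.0 * err * params["reused"] * x,
    }
    return err ** 2, grads


def inner_adapt(params, support, inner_lr, adapt_keys, steps=1):
    """Return a task-adapted copy of params, updating only adapt_keys."""
    adapted = dict(params)  # copy so the meta-initialization is untouched
    x, y = support
    for _ in range(steps):
        _, grads = loss_and_grads(adapted, x, y)
        for key in adapt_keys:
            adapted[key] -= inner_lr * grads[key]
    return adapted


# Hypothetical meta-initialization and a single (input, target) pair,
# standing in for the "one mixture" used for adaptation.
meta_init = {"reused": 1.0, "task_specific": 0.5}
support = (1.0, 2.0)

# MAML-style inner loop: adapt every parameter.
maml_params = inner_adapt(meta_init, support, 0.1, ("reused", "task_specific"))

# ANIL-style inner loop: adapt only the task-specific parameter.
anil_params = inner_adapt(meta_init, support, 0.1, ("task_specific",))

loss_before, _ = loss_and_grads(meta_init, *support)
loss_maml, _ = loss_and_grads(maml_params, *support)
loss_anil, _ = loss_and_grads(anil_params, *support)
```

In this toy setting both variants lower the loss on the single support example after one gradient step, and the ANIL-style variant leaves the `reused` parameter exactly at its meta-initialized value. Which modules of a real separation model play each role is an empirical question that the abstract answers for Conv-TasNet.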

Comments:	Accepted to ICASSP 2021
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2011.10233 [cs.SD]
	(or arXiv:2011.10233v2 [cs.SD] for this version)

Submission history

From: YuanKuei Wu
[v1] Fri, 20 Nov 2020 06:39:48 GMT (565kb,D)
[v2] Mon, 3 May 2021 11:23:17 GMT (569kb,D)
