Computer Science > Sound
Title: Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization
(Submitted on 16 Nov 2018 (v1), last revised 30 Apr 2019 (this version, v3))
Abstract: In this paper, we address speaker-independent multichannel speech enhancement in unknown noisy environments. Our work is based on a well-established multichannel local Gaussian modeling framework. We propose to use a neural network to model the spectro-temporal content of speech. The parameters of this supervised model are learned using the framework of variational autoencoders. The noisy recording environment is assumed to be unknown, so the spectro-temporal model of the noise remains unsupervised and is based on non-negative matrix factorization (NMF). We develop a Monte Carlo expectation-maximization algorithm and show experimentally that the proposed approach outperforms its NMF-based counterpart, in which speech is modeled using supervised NMF.
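As a rough illustration of the unsupervised NMF component mentioned in the abstract, the sketch below factorizes a non-negative matrix V (standing in for a noise power spectrogram) into a dictionary W and activations H. It uses the generic Lee-Seung multiplicative updates for the Euclidean cost, which is an assumption made here for simplicity: the abstract does not state which cost function or update rules the authors use, and this is not their Monte Carlo EM algorithm. The function name `nmf`, the matrix sizes, and the rank are hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-12, seed=0):
    """Approximate V (F x N, non-negative) as W @ H with W, H >= 0.

    Generic Lee-Seung multiplicative updates for the Euclidean cost.
    This is an illustrative stand-in, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random strictly positive initialization.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: a synthetic rank-3 non-negative matrix (not real audio).
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 100))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a semi-supervised setting such as the one the abstract describes, W and H for the noise would be estimated from the noisy recording itself rather than from training data, whereas the speech model is trained beforehand.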
Submission history
From: Simon Leglaive
[v1] Fri, 16 Nov 2018 09:11:07 GMT (172kb,D)
[v2] Fri, 8 Feb 2019 14:42:47 GMT (169kb,D)
[v3] Tue, 30 Apr 2019 13:57:02 GMT (169kb,D)