Parametric Resynthesis with neural vocoders

Maiti, Soumi; Mandel, Michael I

Authors: Soumi Maiti, Michael I Mandel

(Submitted on 16 Jun 2019 (v1), last revised 14 Nov 2019 (this version, v2))

Abstract: Noise suppression systems generally produce output speech with compromised quality. We propose to use the high-quality speech generation capability of neural vocoders for noise suppression. We use a neural network to predict clean mel-spectrogram features from noisy speech and then compare two neural vocoders, WaveNet and WaveGlow, for synthesizing clean speech from the predicted mel-spectrogram. Both WaveNet and WaveGlow achieve better subjective and objective quality scores than the source separation model Chimera++. Further, WaveNet and WaveGlow also achieve significantly better subjective quality ratings than the oracle Wiener mask. Moreover, between WaveNet and WaveGlow, WaveNet achieves the better subjective quality scores, although at the cost of much slower waveform generation.

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as: arXiv:1906.06762 [cs.SD] (or arXiv:1906.06762v2 [cs.SD] for this version)

Submission history

From: Soumi Maiti
[v1] Sun, 16 Jun 2019 20:17:23 GMT (80kb)
[v2] Thu, 14 Nov 2019 18:44:12 GMT (68kb)
