Title: Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses

Authors: Viet Anh Trinh (1), Sebastian Braun (2) ((1) CUNY Graduate Center, (2) Microsoft Research)
Abstract: Speech enhancement has recently achieved great success with various deep learning methods. However, most conventional speech enhancement systems are trained with supervised methods, which pose two significant challenges. First, the majority of training datasets for speech enhancement systems are synthetic. When clean speech and noise corpora are mixed to create these synthetic datasets, a domain mismatch arises between the synthetic data and real-world recordings of noisy speech or audio. Second, there is a trade-off between improving speech enhancement performance and degrading automatic speech recognition (ASR) performance. We therefore propose an unsupervised loss function to tackle these two problems. The loss extends the MixIT loss function with a speech recognition embedding loss and a disentanglement loss. Our results show that the proposed function effectively improves speech enhancement performance compared to a baseline trained in a supervised way on the noisy VoxCeleb dataset. Although fully unsupervised training does not exceed the corresponding baseline, joint supervised and unsupervised training achieves similar speech quality and better ASR performance than the best supervised baseline.
Comments: To appear in Proceedings of ICASSP 2022, May 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2111.08678 [eess.AS]
  (or arXiv:2111.08678v2 [eess.AS] for this version)

Submission history

From: Viet Anh Trinh
[v1] Tue, 16 Nov 2021 18:23:47 GMT (24kb)
[v2] Sat, 19 Feb 2022 19:14:30 GMT (27kb)