Title: Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech

Abstract: Deep neural network (DNN)-based speech enhancement ordinarily requires clean speech signals as the training target. However, collecting clean signals is very costly because they must be recorded in a studio. This requirement currently restricts the amount of training data for speech enhancement to less than 1/1000 of that for speech recognition, which does not need clean signals. Increasing the amount of training data is important for improving performance, and hence the requirement for clean signals should be relaxed. In this paper, we propose a training strategy that does not require clean signals. The proposed method uses only noisy signals for training, which enables us to use a variety of speech signals in the wild. Our experimental results show that the proposed method can achieve performance similar to that of a DNN trained with clean signals.
Subjects: Audio and Speech Processing (eess.AS)
Cite as: arXiv:2101.08625 [eess.AS]
  (or arXiv:2101.08625v2 [eess.AS] for this version)

Submission history

From: Takuya Fujimura
[v1] Thu, 21 Jan 2021 14:14:21 GMT (659kb,D)
[v2] Mon, 10 May 2021 05:56:09 GMT (723kb,D)
