Title: Blind Source Separation with Optimal Transport Non-negative Matrix Factorization

Abstract: Optimal transport as a loss for machine learning optimization problems has recently gained a lot of attention. Building upon recent advances in computational optimal transport, we develop an optimal transport non-negative matrix factorization (NMF) algorithm for supervised speech blind source separation (BSS). Optimal transport allows us to design and leverage a cost between short-time Fourier transform (STFT) spectrogram frequencies, which takes into account how humans perceive sound. We give empirical evidence that using our proposed optimal transport NMF leads to perceptually better results than Euclidean NMF, for both isolated voice reconstruction and BSS tasks. Finally, we demonstrate how to use optimal transport for cross-domain sound processing tasks, where the frequencies represented in the input spectrograms may differ from one spectrogram to another.
Comments: 22 pages, 7 figures, 2 additional files
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
DOI: 10.1186/s13634-018-0576-2
Cite as: arXiv:1802.05429 [cs.SD]
  (or arXiv:1802.05429v1 [cs.SD] for this version)

Submission history

From: Antoine Rolet
[v1] Thu, 15 Feb 2018 08:01:48 GMT (1611kb,A)
