Investigating Deep Neural Transformations for Spectrogram-based Musical Source Separation

Choi, Woosung; Kim, Minseok; Chung, Jaehwa; Lee, Daewon; Jung, Soonyoung

(Submitted on 2 Dec 2019 (this version), latest version 8 Oct 2020 (v3))

Abstract: Musical Source Separation (MSS) is a signal processing task that aims to separate a mixed musical signal into its individual acoustic sources, such as singing voice or drums. Many machine learning-based methods have recently been proposed for MSS, but no existing work evaluates and directly compares the various types of networks. In this paper, we design a variety of neural transformation methods, including time-invariant methods, time-frequency methods, and mixtures of two different transformations. By comparing these transformation methods, our experiments provide abundant material for future work. We train our models on raw complex-valued STFT outputs and achieve state-of-the-art SDR performance on the MUSDB18 singing voice separation task, by a large margin of 1.0 dB.

Comments:	8 pages, 8 tables, 9 figures; under review at ECAI 2020
Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Machine Learning (stat.ML)
Cite as:	arXiv:1912.02591 [eess.AS]
	(or arXiv:1912.02591v1 [eess.AS] for this version)

Submission history

From: Woosung Choi
[v1] Mon, 2 Dec 2019 07:46:19 GMT (1933kb,D)
[v2] Mon, 9 Dec 2019 13:56:59 GMT (1934kb,D)
[v3] Thu, 8 Oct 2020 16:39:49 GMT (1494kb,D)
