Controllable and Interpretable Singing Voice Decomposition via Assem-VC

Kim, Kang-wook; Lee, Junhyeok

(Submitted on 25 Oct 2021)

Abstract: We propose a singing decomposition system that encodes time-aligned linguistic content, pitch, and source speaker identity via Assem-VC. With the decomposed speaker-independent information and the target speaker's embedding, we can synthesize the singing voice of the target speaker. As a result, we produce a perfectly synced duet of the user's singing voice and the target singer's converted singing voice.
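
The data flow described in the abstract can be illustrated with a minimal sketch. This is not the Assem-VC implementation or its API: `DecomposedVoice`, `decompose`, and `convert` are hypothetical names, and the toy functions stand in for the neural encoders and decoder. The sketch only shows why swapping the speaker embedding, while leaving the time-aligned content and pitch untouched, keeps the converted voice in sync with the original.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple


@dataclass(frozen=True)
class DecomposedVoice:
    """Hypothetical container; not a type from the Assem-VC codebase."""
    linguistic: Tuple[str, ...]  # per-frame linguistic content (speaker-independent)
    pitch_hz: Tuple[float, ...]  # per-frame pitch; 0.0 marks an unvoiced frame
    speaker: Tuple[float, ...]   # utterance-level speaker embedding


def decompose(frames: Sequence[Tuple[str, float]],
              speaker: Sequence[float]) -> DecomposedVoice:
    """Toy stand-in for the encoders: split labelled frames into separate streams."""
    return DecomposedVoice(
        linguistic=tuple(symbol for symbol, _ in frames),
        pitch_hz=tuple(f0 for _, f0 in frames),
        speaker=tuple(speaker),
    )


def convert(voice: DecomposedVoice,
            target_speaker: Sequence[float]) -> DecomposedVoice:
    """Swap only the speaker embedding; content and pitch stay frame-aligned."""
    return replace(voice, speaker=tuple(target_speaker))


# Assumed toy input: four frames of (symbol, F0 in Hz) and 2-dimensional embeddings.
user = decompose(
    [("l", 220.0), ("a", 220.0), ("a", 246.9), ("sil", 0.0)],
    speaker=[0.1, 0.9],
)
target = convert(user, target_speaker=[0.8, 0.2])

# Both voices share the same frame count, content, and pitch contour,
# which is the property that makes a synced duet possible.
print(len(user.pitch_hz) == len(target.pitch_hz))
```

In a real system a decoder and vocoder would turn each `DecomposedVoice` back into audio; that stage is omitted here because the abstract does not describe it.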

Comments:	Accepted to NeurIPS Workshop on ML for Creativity and Design 2021 (Oral)
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2110.12676 [eess.AS]
	(or arXiv:2110.12676v1 [eess.AS] for this version)

Submission history

From: Kang-Wook Kim
[v1] Mon, 25 Oct 2021 06:52:00 GMT (4712kb,D)
