Title: Controllable and Interpretable Singing Voice Decomposition via Assem-VC

Abstract: We propose a singing decomposition system that encodes time-aligned linguistic content, pitch, and source speaker identity via Assem-VC. With the decomposed speaker-independent information and the target speaker's embedding, we can synthesize the singing voice of the target speaker. As a result, we create a perfectly synced duet of the user's singing voice and the target singer's converted singing voice.
Comments: Accepted to NeurIPS Workshop on ML for Creativity and Design 2021 (Oral)
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2110.12676 [eess.AS]
  (or arXiv:2110.12676v1 [eess.AS] for this version)

Submission history

From: Kang-Wook Kim
[v1] Mon, 25 Oct 2021 06:52:00 GMT (4712kb,D)
