MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

Tae, Jaesung; Kim, Hyeongju; Lee, Younggun

doi:10.1109/MLSP52302.2021.9596184

(Submitted on 15 Jun 2021 (v1), last revised 20 Nov 2021 (this version, v3))

Abstract: Recent developments in deep learning have significantly improved the quality of synthesized singing voice audio. However, prominent neural singing voice synthesis systems suffer from slow inference speed due to their autoregressive design. Inspired by MLP-Mixer, a novel architecture introduced in the vision literature for attention-free image classification, we propose MLP Singer, a parallel Korean singing voice synthesis system. To the best of our knowledge, this is the first work that uses an entirely MLP-based architecture for voice synthesis. Listening tests demonstrate that MLP Singer outperforms a larger autoregressive GAN-based system in terms of both audio quality and synthesis speed. In particular, MLP Singer achieves a real-time factor of up to 200 on CPUs and 3400 on GPUs, enabling generation that is an order of magnitude faster in both environments.
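
The abstract does not define "real-time factor". The figures of 200 and 3400 read most naturally as seconds of audio generated per second of wall-clock compute, so that higher means faster. The sketch below shows that arithmetic under this assumption. The function name and the sample durations are hypothetical and are not taken from the paper.

```python
def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Seconds of audio produced per second of compute (higher is faster).

    Assumed definition: the abstract does not spell it out.
    """
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


# Hypothetical timings, chosen only to illustrate the ratio.
rtf = real_time_factor(audio_seconds=100.0, synthesis_seconds=0.5)
print(rtf)  # 200.0
```

Note that some speech literature uses the inverse ratio (compute time divided by audio duration, where lower is faster), so it is worth checking which convention a reported figure follows before comparing systems.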

Comments:	6 pages, 5 figures, 2 tables, IEEE MLSP 2021
Subjects:	Sound (cs.SD)
Journal reference:	2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)
DOI:	10.1109/MLSP52302.2021.9596184
Cite as:	arXiv:2106.07886 [cs.SD]
	(or arXiv:2106.07886v3 [cs.SD] for this version)

Submission history

From: Jaesung Tae
[v1] Tue, 15 Jun 2021 05:20:17 GMT (1787kb,D)
[v2] Mon, 5 Jul 2021 18:26:41 GMT (1788kb,D)
[v3] Sat, 20 Nov 2021 21:22:12 GMT (1787kb,D)
