Learning Interpretable Representation for Controllable Polyphonic Music Generation

Wang, Ziyu; Wang, Dingsu; Zhang, Yixiao; Xia, Gus

(Submitted on 17 Aug 2020)

Abstract: While deep generative models have become the leading methods for algorithmic composition, controlling the generation process remains challenging because the latent variables of most deep-learning models lack good interpretability. Inspired by the idea of content-style disentanglement, we design a novel architecture, under the VAE framework, that effectively learns two interpretable latent factors of polyphonic music: chord and texture. The current model focuses on learning 8-beat-long piano composition segments. We show that such chord-texture disentanglement provides a controllable generation pathway leading to a wide spectrum of applications, including compositional style transfer, texture variation, and accompaniment arrangement. Both objective and subjective evaluations show that our method achieves successful disentanglement and high-quality controlled music generation.
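The controllable generation pathway described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' architecture: the class `ToyChordTextureModel`, the function `style_transfer`, the latent size `LATENT_DIM`, and the use of random linear maps in place of trained encoder and decoder networks are all assumptions made for illustration. A real VAE encoder would also output a distribution to sample from, whereas this sketch uses deterministic codes. The only idea it is meant to show is that, once a segment is encoded into two separate factors, the chord factor of one segment can be recombined with the texture factor of another before decoding.

```python
import random

LATENT_DIM = 4  # hypothetical size of each latent factor


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix standing in for a trained network (assumption)."""
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class ToyChordTextureModel:
    """Toy stand-in for a model with separate chord and texture latents."""

    def __init__(self, segment_dim, seed=0):
        rng = random.Random(seed)
        self.chord_enc = make_linear(segment_dim, LATENT_DIM, rng)
        self.texture_enc = make_linear(segment_dim, LATENT_DIM, rng)
        self.decoder = make_linear(2 * LATENT_DIM, segment_dim, rng)

    def encode(self, segment):
        """Return (z_chord, z_texture) as deterministic codes, not samples."""
        z_chord = apply_linear(self.chord_enc, segment)
        z_texture = apply_linear(self.texture_enc, segment)
        return z_chord, z_texture

    def decode(self, z_chord, z_texture):
        """Decode from the concatenation of the two latent factors."""
        return apply_linear(self.decoder, z_chord + z_texture)


def style_transfer(model, chord_source, texture_source):
    """Combine the chord code of one segment with the texture code of another."""
    z_chord, _ = model.encode(chord_source)
    _, z_texture = model.encode(texture_source)
    return model.decode(z_chord, z_texture)
```

With two feature vectors of equal length, `style_transfer(model, segment_a, segment_b)` returns a decoded vector that uses the chord code of `segment_a` and the texture code of `segment_b`. Passing the same segment for both arguments reduces to plain reconstruction through the toy model.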

Subjects:	Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Journal reference:	In Proceedings of 21st International Conference on Music Information Retrieval (ISMIR), Montreal, Canada, 2020
Cite as:	arXiv:2008.07122 [cs.SD]
	(or arXiv:2008.07122v1 [cs.SD] for this version)

Submission history

From: Ziyu Wang
[v1] Mon, 17 Aug 2020 07:11:16 GMT
