StyleWaveGAN: Style-based synthesis of drum sounds with extensive controls using generative adversarial networks

Lavault, Antoine; Roebel, Axel; Voiry, Matthieu

doi:10.5281/zenodo.6573360

(Submitted on 2 Apr 2022)

Abstract: In this paper, we introduce StyleWaveGAN, a style-based drum sound generator that is a variation of StyleGAN, a state-of-the-art image generator. By conditioning StyleWaveGAN on both the type of drum and several audio descriptors, we are able to synthesize waveforms faster than real time on a GPU, directly in CD quality and up to a duration of 1.5 s, while retaining a considerable amount of control over the generation. We also introduce an alternative to the progressive growing of GANs and experiment with the effect of dataset balancing for generative tasks. The experiments are carried out on an augmented subset of a publicly available dataset comprising different drums and cymbals. We evaluate against two recent drum generators, WaveGAN and NeuroDrum, demonstrating significantly improved generation quality (measured with the Fréchet Audio Distance) and interesting results with perceptual features.

Comments:	Accepted for publication in Sound and Music Computing 2022
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
DOI:	10.5281/zenodo.6573360
Cite as:	arXiv:2204.00907 [cs.SD]
	(or arXiv:2204.00907v1 [cs.SD] for this version)

Submission history

From: Antoine Lavault
[v1] Sat, 2 Apr 2022 17:27:17 GMT (418kb,D)
