WaveBeat: End-to-end beat and downbeat tracking in the time domain

Steinmetz, Christian J.; Reiss, Joshua D.

(Submitted on 4 Oct 2021)

Abstract: Deep learning approaches for beat and downbeat tracking have brought advancements. However, these approaches continue to rely on hand-crafted, subsampled spectral features as input, restricting the information available to the model. In this work, we propose WaveBeat, an end-to-end approach for joint beat and downbeat tracking operating directly on waveforms. This method forgoes engineered spectral features and instead produces beat and downbeat predictions directly from the waveform, the first of its kind for this task. Our model uses temporal convolutional networks (TCNs) that operate on waveforms and achieve a very large receptive field ($\geq$ 30 s) at audio sample rates in a memory-efficient manner by employing rapidly growing dilation factors with fewer layers. With a straightforward data augmentation strategy, our method outperforms previous state-of-the-art methods on some datasets, while producing comparable results on others, demonstrating the potential of time-domain approaches.
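The trade-off between dilation growth and layer count mentioned in the abstract can be illustrated with a short receptive-field calculation. The sketch below is not the authors' implementation. It assumes a plain stack of stride-1 dilated 1-D convolutions whose dilation at layer i is growth**i, and the kernel size (15), sample rate (22050 Hz), and growth factors (2 and 8) are hypothetical values chosen for illustration; none of them are stated in the abstract.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated 1-D convolutions."""
    # Each layer widens the receptive field by (kernel_size - 1) * dilation samples.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def layers_needed(target_seconds, sample_rate, kernel_size, growth):
    """Smallest number of layers whose receptive field covers target_seconds.

    Assumes the dilation of layer i is growth**i (an illustrative schedule).
    """
    target_samples = target_seconds * sample_rate
    n_layers = 1
    while receptive_field(kernel_size, [growth**i for i in range(n_layers)]) < target_samples:
        n_layers += 1
    return n_layers


if __name__ == "__main__":
    # Hypothetical settings, not taken from the abstract.
    sample_rate = 22050
    kernel_size = 15
    for growth in (2, 8):
        n = layers_needed(30.0, sample_rate, kernel_size, growth)
        rf = receptive_field(kernel_size, [growth**i for i in range(n)])
        print(f"growth={growth}: {n} layers, {rf / sample_rate:.1f} s receptive field")
```

Under these assumed settings, a growth factor of 8 covers 30 s with 7 layers, whereas doubling the dilation at each layer needs 16. Fewer layers means fewer intermediate activations to hold at the audio sample rate, which is the memory argument the abstract makes.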

Comments:	To appear at the 151st AES Convention
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2110.01436 [eess.AS]
	(or arXiv:2110.01436v1 [eess.AS] for this version)

Submission history

From: Christian Steinmetz
[v1] Mon, 4 Oct 2021 13:31:42 GMT (281kb,D)
