Augmentation Methods on Monophonic Audio for Instrument Classification in Polyphonic Music

Kratimenos, Agelos; Avramidis, Kleanthis; Garoufis, Christos; Zlatintsi, Athanasia; Maragos, Petros

doi:10.23919/Eusipco47968.2020.9287745

(Submitted on 28 Nov 2019 (v1), last revised 3 Mar 2020 (this version, v2))

Abstract: Instrument classification is one of the fields in Music Information Retrieval (MIR) that has attracted a lot of research interest. However, most of that work deals with monophonic music, while efforts on polyphonic material mainly focus on predominant instrument recognition. In this paper, we propose an approach for instrument classification in polyphonic music from purely monophonic data, which involves performing data augmentation by mixing different audio segments. We explore a variety of data augmentation techniques focusing on different sonic aspects, such as overlaying audio segments of the same genre, as well as pitch- and tempo-based synchronization. We utilize Convolutional Neural Networks for the classification task, comparing shallow to deep network architectures. We further investigate the use of a combination of the above classifiers, each trained on a single augmented dataset. An ensemble of VGG-like classifiers, trained on non-augmented, pitch-synchronized, tempo-synchronized and genre-similar excerpts, respectively, yields the best results, achieving slightly above 80% in terms of label ranking average precision (LRAP) on the IRMAS test set, which comprises over 2300 testing tracks.
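
To make the two central ideas of the abstract concrete, here is a minimal sketch of (a) building a pseudo-polyphonic training example by overlaying monophonic segments and (b) scoring predictions with LRAP. It is not the authors' implementation: the function names `mix_segments` and `lrap`, the equal-gain overlay, the peak normalization, and the synthetic sine-wave "instruments" are all assumptions made for illustration, and the genre, pitch, and tempo matching described in the abstract is not shown.

```python
import numpy as np


def mix_segments(segments, labels, gains=None):
    """Overlay equal-length monophonic segments into one pseudo-polyphonic clip.

    segments: list of 1-D float arrays (same length and sample rate assumed)
    labels:   list of multi-hot instrument vectors, one per segment
    gains:    optional per-segment weights (equal weights assumed by default)
    """
    segments = np.stack([np.asarray(s, dtype=float) for s in segments])
    labels = np.stack([np.asarray(l, dtype=int) for l in labels])
    if gains is None:
        gains = np.full(len(segments), 1.0 / len(segments))
    # Weighted sum over the segment axis gives the mixed waveform.
    mix = np.tensordot(np.asarray(gains, dtype=float), segments, axes=1)
    peak = np.max(np.abs(mix))
    if peak > 1.0:  # rescale only if the mix would clip
        mix = mix / peak
    # The mixed clip carries the union of the instruments in its sources.
    mixed_label = (labels.sum(axis=0) > 0).astype(int)
    return mix, mixed_label


def lrap(y_true, y_score):
    """Label ranking average precision for multi-hot targets and real scores."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    per_sample = []
    for truth, score in zip(y_true, y_score):
        relevant = np.flatnonzero(truth)
        if relevant.size == 0 or relevant.size == truth.size:
            # Degenerate case: any ranking is equally good.
            # Treated as 1.0 here (an assumption of this sketch).
            per_sample.append(1.0)
            continue
        # rank_j: how many labels are scored at least as high as label j
        ranks = np.array([(score >= score[j]).sum() for j in relevant])
        # hits_j: how many of those are themselves relevant
        hits = np.array([(score[relevant] >= score[j]).sum() for j in relevant])
        per_sample.append(np.mean(hits / ranks))
    return float(np.mean(per_sample))


# Toy usage: two synthetic one-second "monophonic" tones, three classes.
sr = 22050
t = np.arange(sr) / sr
tone_a = 0.8 * np.sin(2 * np.pi * 440.0 * t)   # stands in for instrument 0
tone_b = 0.8 * np.sin(2 * np.pi * 660.0 * t)   # stands in for instrument 1
clip, target = mix_segments([tone_a, tone_b], [[1, 0, 0], [0, 1, 0]])
scores = np.array([[0.9, 0.2, 0.6]])           # hypothetical classifier output
print(target, lrap([target], scores))
```

LRAP rewards a classifier for ranking the instruments that are actually present above those that are absent, which suits the multi-label setting that mixing creates: each mixed clip has several correct labels rather than one.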

Subjects:	Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
DOI:	10.23919/Eusipco47968.2020.9287745
Cite as:	arXiv:1911.12505 [cs.LG]
	(or arXiv:1911.12505v2 [cs.LG] for this version)

Submission history

From: Agelos Kratimenos
[v1] Thu, 28 Nov 2019 03:12:22 GMT (374kb)
[v2] Tue, 3 Mar 2020 02:19:47 GMT (1159kb,D)
