Low-complexity deep learning frameworks for acoustic scene classification

Pham, Lam; Ngo, Dat; Jalali, Anahid; Schindler, Alexander

Computer Science > Sound

(Submitted on 13 Jun 2022)

Abstract: In this report, we present low-complexity deep learning frameworks for acoustic scene classification (ASC). The proposed frameworks comprise four main steps: front-end spectrogram extraction, online data augmentation, back-end classification, and late fusion of predicted probabilities. In particular, we first transform audio recordings into Mel, Gammatone, and CQT spectrograms. Next, the data augmentation methods of Random Cropping, SpecAugment, and Mixup are applied to generate augmented spectrograms, which are then fed into deep-learning-based classifiers. Finally, to achieve the best performance, we fuse the probabilities obtained from three individual classifiers, each trained independently on one of the three types of spectrogram. Our experiments, conducted on the DCASE 2022 Task 1 Development dataset, fulfilled the low-complexity requirement and achieved a best classification accuracy of 60.1%, improving on the DCASE baseline by 17.2%.
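As an illustration of two of the steps named in the abstract, the following is a minimal sketch of Mixup augmentation and late fusion of predicted probabilities. It is not the authors' implementation. The abstract does not state the fusion rule or the Mixup hyperparameters, so the mean-fusion rule, the `alpha=0.4` default, the function names, and the toy three-class probabilities are all assumptions made for this example.

```python
import random


def mixup(spec_a, label_a, spec_b, label_b, alpha=0.4, rng=random):
    """Blend two spectrograms and their one-hot labels.

    The mixing weight is drawn from Beta(alpha, alpha); alpha=0.4 is an
    assumed value, not one taken from the report.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_spec, mixed_label, lam


def late_fusion(prob_sets):
    """Average per-class probabilities across classifiers.

    Mean fusion is an assumption; the abstract only says that the
    probabilities of three classifiers are fused.
    """
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all classifiers must predict the same classes")
    n_models = len(prob_sets)
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict(prob_sets):
    """Return the index of the most probable class after fusion."""
    fused = late_fusion(prob_sets)
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Hypothetical outputs of classifiers trained on Mel, Gammatone, and CQT
# spectrograms for one recording (toy three-class problem).
mel_probs = [0.6, 0.3, 0.1]
gammatone_probs = [0.2, 0.5, 0.3]
cqt_probs = [0.1, 0.6, 0.3]

best_class, fused_probs = predict([mel_probs, gammatone_probs, cqt_probs])
```

In this toy case the Mel classifier alone would choose class 0, while the fused probabilities favor class 1, which shows how fusion can override a single classifier's prediction.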

Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2206.06057 [cs.SD]
	(or arXiv:2206.06057v1 [cs.SD] for this version)

Submission history

From: Lam Pham
[v1] Mon, 13 Jun 2022 11:41:39 GMT (742kb)
