Title: Acoustic scene classification using teacher-student learning with soft-labels

Abstract: Acoustic scene classification assigns an input segment to one of several pre-defined classes using spectral information. The spectral information of acoustic scenes may not be mutually exclusive, because different classes share common acoustic properties, such as the babble noise present in both airports and shopping malls. However, the conventional training procedure based on one-hot labels does not consider the similarities between different acoustic scenes. We exploit teacher-student learning to derive soft-labels that account for common acoustic properties among different acoustic scenes. In teacher-student learning, the teacher network produces soft-labels, on which the student network is then trained. We investigate various methods of extracting soft-labels that better represent similarities across different scenes. These include extracting soft-labels from multiple audio segments that belong to the same acoustic scene. Experimental results demonstrate the potential of our approach, showing a classification accuracy of 77.36% on the DCASE 2018 task 1 validation set.
Comments: Accepted for presentation at Interspeech 2019
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:1904.10135 [eess.AS]
  (or arXiv:1904.10135v2 [eess.AS] for this version)
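
The teacher-student setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the logit values, the `temperature` parameter, and the choice to average teacher posteriors over segments of the same scene are all assumptions made for illustration. The sketch only shows how a soft-label can carry similarity between scenes (here, airport and shopping mall) and how a student could be scored against it.

```python
import math

# Hypothetical class set; "airport" and "shopping_mall" come from the
# abstract's example, "park" is added only for illustration.
CLASSES = ["airport", "shopping_mall", "park"]


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution.

    `temperature` is an assumed knob: values above 1 flatten the output.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_soft_label(teacher_logits_per_segment, temperature=1.0):
    """Average teacher posteriors over several segments of one scene.

    Assumed reading of "extracting soft-labels from multiple audio
    segments"; the paper may combine segments differently.
    """
    posteriors = [softmax(l, temperature) for l in teacher_logits_per_segment]
    count = len(posteriors)
    return [sum(p[k] for p in posteriors) / count for k in range(len(CLASSES))]


def soft_cross_entropy(student_logits, soft_label):
    """Cross-entropy between a soft-label and the student's posterior."""
    probs = softmax(student_logits)
    return -sum(t * math.log(p) for t, p in zip(soft_label, probs) if t > 0)


# Made-up teacher outputs for two segments labelled "airport".
teacher_logits = [[2.0, 1.5, -1.0], [2.5, 1.0, -0.5]]
soft_label = mean_soft_label(teacher_logits)

# Made-up student output for a training segment from the same scene.
student_logits = [1.0, 0.5, 0.0]
loss = soft_cross_entropy(student_logits, soft_label)

print(dict(zip(CLASSES, soft_label)))
print(loss)
```

Unlike a one-hot target, the resulting soft-label gives "shopping_mall" noticeably more mass than "park", so a student trained on it is penalized less for confusing the two acoustically similar scenes.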

Submission history

From: Jee-Weon Jung
[v1] Tue, 23 Apr 2019 03:42:20 GMT (511kb,D)
[v2] Wed, 17 Jul 2019 04:10:52 GMT (512kb,D)
