Computer Science > Sound
Title: Variational Autoencoders for Learning Latent Representations of Speech Emotion
(Submitted on 23 Dec 2017 (this version), latest version 28 Jul 2020 (v3))
Abstract: Learning latent representations of data in an unsupervised fashion can yield more relevant features that enhance the performance of a classifier. For speech emotion recognition, generating effective features is crucial. Recently, deep generative models such as Variational Autoencoders (VAEs) have had considerable success in modeling natural images. Inspired by this, we use VAEs to model emotions in human speech. We derive latent representations of speech signals and use them to classify emotions. We demonstrate that features learned by VAEs can achieve state-of-the-art emotion recognition results.
Submission history
From: Siddique Latif
[v1] Sat, 23 Dec 2017 03:54:00 GMT (38kb)
[v2] Mon, 26 Mar 2018 07:47:54 GMT (373kb,D)
[v3] Tue, 28 Jul 2020 01:35:27 GMT (373kb,D)