Deep learning for speech emotion feature extraction and classification: current trends and future directions

Himashri Deka, Vikas Mittal

Abstract

Speech Emotion Recognition (SER) systems have proven useful in domains such as social media analysis and customer satisfaction measurement. Traditional SER systems rely on older datasets and feature extraction techniques, which make recognition less reliable and less robust. Recently, with advances in deep learning algorithms and the production of massive amounts of speech-emotion data, unsupervised learning has been widely used to automate feature extraction. Deep learning algorithms have also proven very effective for emotion classification and for transfer learning of emotions. In this paper, we cover recent trends in deep learning algorithms for speech emotion feature extraction and classification, along with a comparative study of these algorithms. We also present the advantages of deep learning algorithms over traditional approaches for building SER systems. Further, we compare unsupervised learning with supervised learning and discuss the advantages of the former given the variation in speech emotion data. Finally, we discuss popular datasets, along with some recently developed ones, that can support the development of SER systems.
