A Breakthrough In Speech Emotion Recognition Using Deep Retinal Convolution Neural Networks
2017 · Yafeng Niu, Dongsheng Zou, Yadong Niu, et al.
Abstract
Speech emotion recognition (SER) studies the formation and change of a speaker's emotional state from the perspective of the speech signal, with the aim of making interaction between humans and computers more intelligent. SER is a challenging task that suffers from limited training data and low prediction accuracy. Here we propose a data augmentation algorithm based on the imaging principle of the retina and a convex lens: by changing the distance between the spectrogram and the convex lens, we obtain spectrograms of different sizes and thereby increase the amount of training data. In addition, using deep learning to extract high-level features, we propose Deep Retinal Convolution Neural Networks (DRCNNs) for SER and achieve an average accuracy of over 99%. The experimental results indicate that DRCNNs outperform previous studies in terms of both the number of emotions recognized and the recognition accuracy. Predictably, our results will dramatically improve human-computer interac
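The lens-based augmentation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption: the thin-lens magnification formula stands in for the paper's imaging model, nearest-neighbour resampling stands in for whatever resizing the authors use, and the names `lens_magnification`, `resize_nearest`, and `augment_spectrogram` are hypothetical.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: an ideal thin convex lens, where an object at distance u > f
# forms a real image scaled by f / (u - f).

def lens_magnification(object_distance, focal_length):
    """Return the thin-lens magnification f / (u - f) for u > f."""
    if object_distance <= focal_length:
        raise ValueError("object_distance must exceed focal_length")
    return focal_length / (object_distance - focal_length)


def resize_nearest(image, scale):
    """Rescale a 2-D list of values by `scale` with nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    new_rows = max(1, round(rows * scale))
    new_cols = max(1, round(cols * scale))
    # Map each output index back to the closest source index.
    row_idx = [min(int(r / scale), rows - 1) for r in range(new_rows)]
    col_idx = [min(int(c / scale), cols - 1) for c in range(new_cols)]
    return [[image[r][c] for c in col_idx] for r in row_idx]


def augment_spectrogram(spectrogram, focal_length, object_distances):
    """Produce one rescaled copy of the spectrogram per object distance."""
    return [
        resize_nearest(spectrogram, lens_magnification(u, focal_length))
        for u in object_distances
    ]


# Toy 6 x 4 "spectrogram" (hypothetical values, for illustration only).
spec = [[r * 4 + c for c in range(4)] for r in range(6)]
augmented = augment_spectrogram(spec, focal_length=1.0,
                                object_distances=[1.5, 2.0, 3.0])
shapes = [(len(a), len(a[0])) for a in augmented]
```

Under these assumptions, moving the spectrogram closer to the lens (while staying beyond the focal length) enlarges the image and moving it away shrinks it, so each distance yields one additional training sample of a different size.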
Related papers
- Hybrid Data Augmentation And Deep Attention-based Dilated Convolutional-recurrent Neural Networks For Speech Emotion Recognition (2021)
- Improved Speech Emotion Recognition Using Transfer Learning And Spectrogram Augmentation (2021)
- Towards Interpretable And Transferable Speech Emotion Recognition: Latent Representation Based Analysis Of Features, Methods And Corpora (2021)
- Speech Emotion Recognition With Multiscale Area Attention And Data Augmentation (2021)
- Searching For Effective Preprocessing Method And CNN-based Architecture With Efficient Channel Attention On Speech Emotion Recognition (2024)
- Deep Residual Local Feature Learning For Speech Emotion Recognition (2020)
- Enhancing Speech Emotion Recognition Through Differentiable Architecture Search (2023)
- Unsupervised Representations Improve Supervised Learning In Speech Emotion Recognition (2023)