Multi-channel Auto-encoder For Speech Emotion Recognition
2018 · Zefang Zong, Hao Li, Qi Wang
Abstract
Inferring users' emotional state from their queries plays an important role in enhancing the capability of voice dialogue applications. Although several related works have obtained satisfactory results, performance can still be improved. In this paper, we propose a novel framework named multi-channel auto-encoder (MTC-AE) for emotion recognition from acoustic information. MTC-AE contains multiple local DNNs, each based on different low-level descriptors with different statistics functions, that are partly concatenated together, which enables the structure to consider both local and global features simultaneously. Experiments on the benchmark dataset IEMOCAP show that our method significantly outperforms the existing state-of-the-art results, achieving \(64.8\%\) leave-one-speaker-out unweighted accuracy, which is \(2.4\%\) higher than the best previously reported result on this dataset.
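The headline metric, unweighted accuracy, is commonly computed in speech emotion recognition as the mean of per-class recalls, so that every emotion class counts equally regardless of how many utterances it has. The sketch below illustrates that common definition next to plain (weighted) accuracy. It is not taken from the paper: the function names and the toy labels are assumptions made for illustration only.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (assumed definition, common in SER work)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = defaultdict(int)  # correctly classified utterances per class
    total = defaultdict(int)    # number of utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Plain accuracy: fraction of all utterances classified correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Hypothetical toy labels: a classifier that always predicts "neutral".
y_true = ["neutral", "neutral", "neutral", "angry"]
y_pred = ["neutral", "neutral", "neutral", "neutral"]

ua = unweighted_accuracy(y_true, y_pred)  # (1.0 + 0.0) / 2 = 0.5
wa = weighted_accuracy(y_true, y_pred)    # 3 / 4 = 0.75
```

On imbalanced data such as this toy example, the unweighted figure penalizes a model that ignores the minority class, which is why it is often preferred for emotion corpora. Under a leave-one-speaker-out protocol, a score like this would be computed on each held-out speaker's utterances; how the paper aggregates across folds is not stated in the abstract.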
Related papers
- Multi-task Semi-supervised Adversarial Autoencoding For Speech Emotion Recognition (2019)
- An Ensemble Framework Of Voice-based Emotion Recognition System For Films And TV Programs (2018)
- Emotech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025)
- Attention Based Fully Convolutional Network For Speech Emotion Recognition (2018)
- DNN-HMM Based Speaker Adaptive Emotion Recognition Using Proposed Epoch And MFCC Features (2018)
- Attention-augmented End-to-end Multi-task Learning For Emotion Prediction From Speech (2019)
- Adversarial Auto-encoders For Speech Based Emotion Recognition (2018)
- Multimodal Speech Emotion Recognition And Ambiguity Resolution (2019)