Eigenemo: Spectral Utterance Representation Using Dynamic Mode Decomposition For Speech Emotion Classification
2020 · Shuiyang Mao, P. C. Ching, Tan Lee
Abstract
Human emotional speech is, by its very nature, a variant signal. This results in dynamics intrinsic to automatic emotion classification based on speech. In this work, we explore a spectral decomposition method stemming from fluid dynamics, known as Dynamic Mode Decomposition (DMD), to computationally represent and analyze the global utterance-level dynamics of emotional speech. Specifically, segment-level emotion-specific representations are first learned through an Emotion Distillation process. This forms a multi-dimensional signal of emotion flow for each utterance, called an Emotion Profile (EP). The DMD algorithm is then applied to the resultant EPs to capture the eigenfrequencies, and hence the fundamental transition dynamics, of the emotion flow. Evaluation experiments using the proposed approach, which we call EigenEmo, show promising results. Moreover, due to the positive combination of their complementary properties, concatenating the utterance representations generated by Eigen
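To make the DMD step described in the abstract concrete, the following is a minimal sketch of standard exact DMD in NumPy. It is not the authors' implementation: the function name `dmd`, its parameters `profile`, `rank`, and `dt`, and the synthetic two-dimensional input are all assumptions made for illustration. A real Emotion Profile would instead be a matrix of per-segment emotion scores produced by the Emotion Distillation stage.

```python
import numpy as np


def dmd(profile, rank=None, dt=1.0):
    """Exact DMD of a (n_dims, n_steps) array (hypothetical helper).

    Returns the DMD eigenvalues, the DMD modes, and the oscillation
    frequency of each mode in cycles per unit of `dt`.
    """
    # Time-shifted snapshot matrices: X2 is X1 advanced by one step.
    X1, X2 = profile[:, :-1], profile[:, 1:]

    # Thin SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    if rank is not None:
        # Optional truncation; useful for dropping near-zero singular values.
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Project the best-fit linear operator (X2 ~ A X1) onto the SVD basis.
    B = X2 @ Vh.conj().T @ np.diag(1.0 / s)
    A_tilde = U.conj().T @ B

    # Eigenvalues give the dynamics; eigenvectors lift back to DMD modes.
    eigvals, W = np.linalg.eig(A_tilde)
    modes = B @ W

    # The phase of each eigenvalue is its rotation per step.
    freqs = np.angle(eigvals) / (2.0 * np.pi * dt)
    return eigvals, modes, freqs


# Synthetic stand-in for an emotion profile: a pure rotation at
# 0.1 cycles per step in two dimensions (assumed data, not from the paper).
t = np.arange(50)
toy_profile = np.vstack([np.cos(2 * np.pi * 0.1 * t),
                         np.sin(2 * np.pi * 0.1 * t)])
eigvals, modes, freqs = dmd(toy_profile)
print(np.abs(eigvals), freqs)
```

On this toy input the eigenvalues lie on the unit circle and the recovered frequencies are a conjugate pair at about ±0.1 cycles per step, which is the sense in which DMD eigenfrequencies summarize the transition dynamics of a multi-dimensional signal. How the eigenvalues and modes are then assembled into an utterance-level feature vector is not specified in the abstract and is left out of the sketch.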
Related papers
- An Extended Variational Mode Decomposition Algorithm Developed Speech Emotion Recognition Performance (2023)
- Msemotts: Multi-scale Emotion Transfer, Prediction, And Control For Emotional Speech Synthesis (2022)
- Speech Emotion Recognition With Distilled Prosodic And Linguistic Affect Representations (2023)
- Semantic Matters: Multimodal Features For Affective Analysis (2025)
- Emodiarize: Speaker Diarization And Emotion Identification From Speech Signals Using Convolutional Neural Networks (2023)
- Objective Human Affective Vocal Expression Detection And Automatic Classification With Stochastic Models And Learning Systems (2019)
- Speech Emotion: Investigating Model Representations, Multi-task Learning And Knowledge Distillation (2022)
- Emotech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025)