An Ensemble Framework Of Voice-based Emotion Recognition System For Films And TV Programs
2018 · Fei Tao, Gang Liu, Qingen Zhao
Abstract
Adding voice-based emotion recognition to artificial intelligence (AI) products can improve the user experience. Most prior research has focused on speech collected under well-controlled conditions, and conventional approaches may fail when background noise or non-speech fillers are present. In this paper, we propose an ensemble framework that combines several types of audio features. The framework incorporates gender and speaker information through multi-task learning, which helps it capture as much emotional information as possible. The framework is evaluated on the Multimodal Emotion Challenge (MEC) 2017 corpus, which is close to real-world conditions. The proposed framework outperforms the best baseline system by 29.5% (relative improvement).
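The abstract does not spell out how the gender and speaker information enter training. As a minimal sketch only, one common way to set up multi-task learning is to add auxiliary classification losses to the main emotion loss. The function names, the weights `w_gender` and `w_speaker`, and the example probabilities below are all hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def multitask_loss(emotion_probs, gender_probs, speaker_probs,
                   emotion_y, gender_y, speaker_y,
                   w_gender=0.3, w_speaker=0.3):
    """Main emotion loss plus weighted auxiliary gender and speaker losses.

    The auxiliary weights are illustrative assumptions, not values from the paper.
    """
    main = cross_entropy(emotion_probs, emotion_y)
    aux_gender = cross_entropy(gender_probs, gender_y)
    aux_speaker = cross_entropy(speaker_probs, speaker_y)
    return main + w_gender * aux_gender + w_speaker * aux_speaker


if __name__ == "__main__":
    # Hypothetical posteriors for one utterance: 4 emotions, 2 genders, 3 speakers.
    loss = multitask_loss(
        emotion_probs=[0.1, 0.6, 0.2, 0.1],
        gender_probs=[0.8, 0.2],
        speaker_probs=[0.2, 0.5, 0.3],
        emotion_y=1, gender_y=0, speaker_y=1,
    )
    print(f"combined loss: {loss:.4f}")
```

Setting both auxiliary weights to zero reduces this to ordinary single-task emotion training, which makes it easy to compare the two setups.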
Related papers
- Multi-channel Auto-encoder For Speech Emotion Recognition (2018)
- Human Vocal Sentiment Analysis (2019)
- Framewise Approach In Multimodal Emotion Recognition In OMG Challenge (2018)
- Emotion Recognition System From Speech And Visual Information Based On Convolutional Neural Networks (2020)
- Emotech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025)
- Emodiarize: Speaker Diarization And Emotion Identification From Speech Signals Using Convolutional Neural Networks (2023)
- Speaker Verification In Emotional Talking Environments Based On Three-stage Framework (2018)
- Emotion Recognition From Speech (2019)