A Comprehensive Survey On Multi-modal Conversational Emotion Recognition With Deep Learning
2023 · Yuntao Shou, Tao Meng, Wei Ai, et al.
Abstract
Multi-modal conversation emotion recognition (MCER) aims to recognize and track a speaker's emotional state using text, speech, and visual information from the conversation scene. Studying MCER matters for affective computing, intelligent recommendation, and human-computer interaction. Unlike traditional single-utterance multi-modal emotion recognition or single-modal conversation emotion recognition, MCER is a more challenging problem because it must handle more complex emotional interaction relationships. The critical issue is learning consistent and complementary semantics for multi-modal feature fusion based on those emotional interaction relationships. Researchers have conducted extensive deep-learning-based research on MCER to address this problem, but a systematic review of the modeling methods is still lacking. Therefore, a timely and comprehensive overview of MCER's recent advances in deep learning is of great significance to ac
Related papers
- GatedxLSTM: A Multimodal Affective Computing Approach For Emotion Recognition In Conversations (2025) · 0.00
- Multimodal Emotion Recognition And Sentiment Analysis In Multi-party Conversation Contexts (2025) · 0.00
- BeMERC: Behavior-aware MLLM-based Framework For Multimodal Emotion Recognition In Conversation (2025) · 0.00
- MMER: Multimodal Multi-task Learning For Speech Emotion Recognition (2022) · 10.07
- Agent-based Modular Learning For Multimodal Emotion Recognition In Human-agent Systems (2025) · 0.00
- Exploring Attention Mechanisms For Multimodal Emotion Recognition In An Emergency Call Center Corpus (2023) · 8.09
- Emotech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025) · 8.35
- Multimodal Speech Emotion Recognition Using Audio And Text (2018) · 18.02