GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations
2025 · Yupei Li, Qiyang Sun, Sunil Munthumoduku Krishna Murthy, et al.
Abstract
Affective Computing (AC) is essential for advancing Artificial General Intelligence (AGI), with emotion recognition serving as a key component. However, human emotions are inherently dynamic, influenced not only by an individual's expressions but also by interactions with others, and single-modality approaches often fail to capture their full dynamics. Multimodal Emotion Recognition (MER) leverages multiple signals but traditionally relies on utterance-level analysis, overlooking the dynamic nature of emotions in conversations. Emotion Recognition in Conversation (ERC) addresses this limitation, yet existing methods struggle to align multimodal features and explain why emotions evolve within dialogues. To bridge this gap, we propose GatedxLSTM, a novel speech-text multimodal ERC model that explicitly considers voice and transcripts of both the speaker and their conversational partner(s) to identify the most influential sentences driving emotional shifts. By integrating Contrastive Lang
Related papers
- Conversational Emotion Analysis via Attention Mechanisms (2019)
- A Comprehensive Survey on Multi-modal Conversational Emotion Recognition with Deep Learning (2023)
- BeMERC: Behavior-aware MLLM-based Framework for Multimodal Emotion Recognition in Conversation (2025)
- ML-SAN: Multi-level Speaker-adaptive Network for Emotion Recognition in Conversations (2026)
- LLM Supervised Pre-training for Multimodal Emotion Recognition in Conversations (2025)
- Dynamic Graph Neural ODE Network for Multi-modal Emotion Recognition in Conversation (2024)
- GSDNet: Revisiting Incomplete Multimodal-diffusion from Graph Spectrum Perspective for Conversation Emotion Recognition (2025)
- Multimodal Emotion Recognition and Sentiment Analysis in Multi-party Conversation Contexts (2025)