ML-SAN: Multi-level Speaker-adaptive Network For Emotion Recognition In Conversations

Abstract

arXiv:2604.25383v1

For machines to establish empathy with humans, it is essential that they fully understand human emotional changes. However, research in multimodal emotion recognition often overlooks one problem: expressive traits vary significantly across individuals, so different people may express the same emotion differently. This is evident in daily life. Some people express "happiness" through their facial expressions and words, while others hide it or express it through their actions. Both are expressions of "happiness," but such differences in emotional expression remain difficult for machines to distinguish. Current emotion recognition remains at a "static" level, using a single recognition model to identify all emotional styles. This "simplification" often degrades recognition results, especially in multi-turn dialogues. To address this problem, this paper introduces a novel Multi-Level Speaker Ad
