Detecting Emotion Carriers By Combining Acoustic And Lexical Representations
2021 · Sebastian P. Bayerl, Aniruddha Tammewar, Korbinian Riedhammer, et al.
Abstract
Personal narratives (PN) - spoken or written - are recollections of facts, people, events, and thoughts from one's own experience. Emotion recognition and sentiment analysis tasks are usually defined at the utterance or document level. However, in this work, we focus on Emotion Carriers (EC) defined as the segments (speech or text) that best explain the emotional state of the narrator ("loss of father", "made me choose"). Once extracted, such EC can provide a richer representation of the user state to improve natural language understanding and dialogue modeling. In previous work, it has been shown that EC can be identified using lexical features. However, spoken narratives should provide a richer description of the context and the users' emotional state. In this paper, we leverage word-based acoustic and textual embeddings as well as early and late fusion techniques for the detection of ECs in spoken narratives. For the acoustic word-level representations, we use Residual Neural Networ
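The difference between the early and late fusion mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vector sizes, the toy linear scorer, the function names, and the weighted-average rule for late fusion are all assumptions chosen for clarity. The abstract as shown here does not specify the actual models or fusion rules.

```python
import math
from typing import List

Vector = List[float]


def early_fusion(acoustic: Vector, lexical: Vector) -> Vector:
    """Early fusion: concatenate the word-level acoustic and lexical
    embeddings so that a single model sees both modalities at once."""
    return acoustic + lexical


def linear_score(features: Vector, weights: Vector, bias: float) -> float:
    """Hypothetical per-word scorer: a logistic unit returning the
    probability that a word belongs to an emotion carrier."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def late_fusion(p_acoustic: float, p_lexical: float, alpha: float = 0.5) -> float:
    """Late fusion: each modality is scored by its own model, and the
    two probabilities are combined afterwards (here, a weighted mean;
    alpha is an assumed mixing weight)."""
    return alpha * p_acoustic + (1.0 - alpha) * p_lexical


# Toy embeddings for a single word (values are made up).
acoustic_vec = [0.2, -0.1, 0.4]
lexical_vec = [0.7, 0.3]

# Early fusion: one scorer over the concatenated vector.
fused_vec = early_fusion(acoustic_vec, lexical_vec)
p_early = linear_score(fused_vec, [0.5, 0.5, 0.5, 0.5, 0.5], 0.0)

# Late fusion: one scorer per modality, then combine the outputs.
p_a = linear_score(acoustic_vec, [0.5, 0.5, 0.5], 0.0)
p_l = linear_score(lexical_vec, [0.5, 0.5], 0.0)
p_late = late_fusion(p_a, p_l)
```

In the early variant the classifier can learn interactions between acoustic and lexical dimensions, whereas the late variant keeps the two models independent and only merges their decisions.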
Related papers
- Fusion Approaches For Emotion Recognition From Speech Using Acoustic And Text-based Features (2024)
- Leveraging Content And Acoustic Representations For Speech Emotion Recognition (2024)
- Emodiarize: Speaker Diarization And Emotion Identification From Speech Signals Using Convolutional Neural Networks (2023)
- Semantic Matters: Multimodal Features For Affective Analysis (2025)
- Vocal Style Factorization For Effective Speaker Recognition In Affective Scenarios (2023)
- Embedded Emotions -- A Data Driven Approach To Learn Transferable Feature Representations From Raw Speech Input For Emotion Recognition (2020)
- EMNS /imz/ Corpus: An Emotive Single-speaker Dataset For Narrative Storytelling In Games, Television And Graphic Novels (2023)
- Emotech: A Multi-modal Speech Emotion Recognition Using Multi-source Low-level Information With Hybrid Recurrent Network (2025)