Information Fusion In Attention Networks Using Adaptive And Multi-level Factorized Bilinear Pooling For Audio-visual Emotion Recognition
2021 · Hengshun Zhou, Jun Du, Yuanyuan Zhang, et al.
Abstract
Multimodal emotion recognition is a challenging task in emotion computing because it is difficult to extract discriminative features that capture the subtle differences among human emotions, which are abstract concepts with multiple expressions. Moreover, how to fully utilize both audio and visual information is still an open problem. In this paper, we propose a novel multimodal fusion attention network for audio-visual emotion recognition based on adaptive and multi-level factorized bilinear pooling (FBP). First, for the audio stream, a fully convolutional network (FCN) equipped with a 1-D attention mechanism and local response normalization is designed for speech emotion recognition. Next, a global FBP (G-FBP) approach is presented to perform audio-visual information fusion by integrating a self-attention-based video stream with the proposed audio stream. To improve G-FBP, an adaptive strategy (AG-FBP) to dynamically calculate the fusion weight of two modalities is devised based on the emotion-relat
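As a rough illustration of the fusion step named in the abstract, the sketch below shows the generic form of factorized bilinear pooling: each modality is projected into a shared space, the projections are multiplied element-wise, and groups of `k` products are sum-pooled. This is not the authors' implementation. The function names, the dimensions, the random projection matrices `U` and `V`, and the power and L2 normalization step are all assumptions made for the example; the abstract gives no formulas, and the adaptive weighting (AG-FBP) is not modelled here.

```python
import numpy as np


def fbp_raw(audio, video, U, V, k):
    """Generic factorized bilinear pooling (hypothetical helper).

    audio: (d_a,) audio feature vector
    video: (d_v,) video feature vector
    U:     (d_a, k * o) audio projection (learned in practice, random here)
    V:     (d_v, k * o) video projection (learned in practice, random here)
    k:     number of rank-1 factors summed per output dimension
    Returns an (o,) fused vector before normalization.
    """
    joint = (audio @ U) * (video @ V)        # element-wise product, shape (k * o,)
    return joint.reshape(-1, k).sum(axis=1)  # sum-pool each group of k, shape (o,)


def fbp(audio, video, U, V, k):
    """FBP followed by signed square-root and L2 normalization (assumed step)."""
    z = fbp_raw(audio, video, U, V, k)
    z = np.sign(z) * np.sqrt(np.abs(z))      # power normalization
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z


# Toy dimensions chosen only for illustration.
rng = np.random.default_rng(0)
d_a, d_v, k, o = 8, 6, 3, 4
U = rng.standard_normal((d_a, k * o))
V = rng.standard_normal((d_v, k * o))
audio = rng.standard_normal(d_a)
video = rng.standard_normal(d_v)

raw = fbp_raw(audio, video, U, V, k)
fused = fbp(audio, video, U, V, k)
print(fused.shape)
```

Each output dimension of `fbp_raw` equals a full bilinear form `audio^T W_i video` whose weight matrix `W_i` is constrained to rank at most `k`, which is why the factorized version needs far fewer parameters than an unconstrained bilinear layer.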
Related papers
- AMFFCN: Attentional Multi-layer Feature Fusion Convolution Network For Audio-visual Speech Enhancement (2021) 0.00
- Multimodal Fusion With Deep Neural Networks For Audio-video Emotion Recognition (2019) 0.00
- A Joint Cross-attention Model For Audio-visual Fusion In Dimensional Emotion Recognition (2022) 18.00
- Multistage Linguistic Conditioning Of Convolutional Layers For Speech Emotion Recognition (2021) 9.23
- Mutilmodal Feature Extraction And Attention-based Fusion For Emotion Estimation In Videos (2023) 1.40
- Group Gated Fusion On Attention-based Bidirectional Alignment For Multimodal Emotion Recognition (2022) 11.39
- Audio-guided Fusion Techniques For Multimodal Emotion Analysis (2024) 4.52
- Enhancing Modal Fusion By Alignment And Label Matching For Multimodal Emotion Recognition (2024) 6.34