Attribution Regularization For Multimodal Paradigms
2024 · Sahiti Yerramilli, Jayant Sravan Tamarapalli, Jonathan Francis, et al.
Abstract
Multimodal machine learning has gained significant attention in recent years due to its potential for integrating information from multiple modalities to enhance learning and decision-making processes. However, it is commonly observed that unimodal models outperform multimodal models, despite the latter having access to richer information. Additionally, the influence of a single modality often dominates the decision-making process, resulting in suboptimal performance. This research project aims to address these challenges by proposing a novel regularization term that encourages multimodal models to effectively utilize information from all modalities when making decisions. The focus of this project lies in the video-audio domain, although the proposed regularization technique holds promise for broader applications in embodied AI research, where multiple modalities are involved. By leveraging this regularization term, the proposed approach aims to mitigate the issue of unimodal dominance
Related papers
- Contrastive Regularization For Multimodal Emotion Recognition Using Audio And Text (2022)
- Towards Robust Multimodal Learning In The Open World (2025)
- Quantifying Multimodal Imbalance: A GMM-guided Adaptive Loss For Audio-visual Learning (2025)
- Multi-task Regularization Based On Infrequent Classes For Audio Captioning (2020)
- Analyzing Utility Of Visual Context In Multimodal Speech Recognition Under Noisy Conditions (2019)
- A Modular End-to-end Multimodal Learning Method For Structured And Unstructured Data (2024)
- Enhancing Multimodal Sentiment Analysis For Missing Modality Through Self-distillation And Unified Modality Cross-attention (2024)
- Fine-grained Grounding For Multimodal Speech Recognition (2020)