Retrieval-augmented Dynamic Prompt Tuning For Incomplete Multimodal Learning
2025 · Jian Lang, Zhangtao Cheng, Ting Zhong, et al.
Abstract
Multimodal learning with incomplete modality is practical and challenging. Recently, researchers have focused on enhancing the robustness of pre-trained MultiModal Transformers (MMTs) under missing modality conditions by applying learnable prompts. However, these prompt-based methods face several limitations: (1) incomplete modalities provide restricted modal cues for task-specific inference, (2) dummy imputation for missing content causes information loss and introduces noise, and (3) static prompts are instance-agnostic, offering limited knowledge for instances with various missing conditions. To address these issues, we propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework. RAGPT comprises three modules: (I) the multi-channel retriever, which identifies similar instances through a within-modality retrieval strategy, (II) the missing modality generator, which recovers missing information using retrieved contexts, and (III) the context-aware prompter, which captur
Related papers
- DGL: Dynamic Global-local Prompt Tuning For Text-video Retrieval (2024)
- Fine-grained Retrieval Prompt Tuning (2022)
- Re-ranking The Context For Multimodal Retrieval Augmented Generation (2025)
- Modular Retrieval For Generalization And Interpretation (2023)
- CREM: Compression-driven Representation Enhancement For Multimodal Retrieval And Comprehension (2026)
- MLLM Is A Strong Reranker: Advancing Multimodal Retrieval-augmented Generation Via Knowledge-enhanced Reranking And Noise-injected Training (2024)
- MuRAG: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text (2022)
- Soft Prompt Tuning For Augmenting Dense Retrieval With Large Language Models (2023)