Dynamic Adapter With Semantics Disentangling For Cross-lingual Cross-modal Retrieval
2024 Β· Rui Cai, Zhiyu Dong, Jianfeng Dong, et al.
Abstract
Existing cross-modal retrieval methods typically rely on large-scale vision-language pair data. This makes it challenging to efficiently develop a cross-modal retrieval model for under-resourced languages of interest. Therefore, Cross-lingual Cross-modal Retrieval (CCR), which aims to align vision and the low-resource language (the target language) without using any human-labeled target-language data, has gained increasing attention. As a general parameter-efficient way, a common solution is to utilize adapter modules to transfer the vision-language alignment ability of Vision-Language Pretraining (VLP) models from a source language to a target language. However, these adapters are usually static once learned, making it difficult to adapt to target-language captions with varied expressions. To alleviate it, we propose Dynamic Adapter with Semantics Disentangling (DASD), whose parameters are dynamically generated conditioned on the characteristics of the input captions. Considering that
Authors
(none)
Tags
Stats
Related papers
- Ucdr-adapter: Exploring Adaptation Of Pre-trained Vision-language Models For Universal Cross-domain Retrieval (2024)4.52
- Cross-modal Adapter: Parameter-efficient Transfer Learning Approach For Vision-language Models (2024)6.77
- Queryadapter: Rapid Adaptation Of Vision-language Models In Response To Natural Language Queries (2025)0.00
- Understanding Retrieval-augmented Task Adaptation For Vision-language Models (2024)0.00
- A Multimodal Recaptioning Framework To Account For Perceptual Diversity Across Languages In Vision-language Modeling (2025)0.00
- Vldeformer: Vision-language Decomposed Transformer For Fast Cross-modal Retrieval (2021)10.21
- Context-adaptive Multi-prompt Embedding With Large Language Models For Vision-language Alignment (2025)0.00
- SCA3D: Enhancing Cross-modal 3D Retrieval Via 3D Shape And Caption Paired Data Augmentation (2025)4.17