UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval
2024 · Haoyu Jiang, Zhi-Qi Cheng, Gabriel Moreira, et al.
Abstract
Universal Cross-Domain Retrieval (UCDR) retrieves relevant images from unseen domains and classes without semantic labels, ensuring robust generalization. Existing methods commonly employ prompt tuning with pre-trained vision-language models but are inherently limited by static prompts, reducing adaptability. We propose UCDR-Adapter, which enhances pre-trained models with adapters and dynamic prompt generation through a two-phase training strategy. First, Source Adapter Learning integrates class semantics with domain-specific visual knowledge using a Learnable Textual Semantic Template and optimizes Class and Domain Prompts via momentum updates and dual loss functions for robust alignment. Second, Target Prompt Generation creates dynamic prompts by attending to masked source prompts, enabling seamless adaptation to unseen domains and classes. Unlike prior approaches, UCDR-Adapter dynamically adapts to evolving data distributions, enhancing both flexibility and generalization. During in
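The abstract names two mechanisms: momentum updates of prompts, and target prompts generated by attending to masked source prompts. The following is a minimal sketch of each, not the authors' implementation. The function names (`ema_update`, `generate_target_prompt`), the momentum value, and the use of plain scaled dot-product attention are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def ema_update(momentum_prompt, prompt, m=0.99):
    """Momentum (exponential moving average) update of a prompt vector.

    The coefficient m is a hypothetical default, not a value from the paper.
    """
    return [m * a + (1.0 - m) * b for a, b in zip(momentum_prompt, prompt)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_target_prompt(query, source_prompts, mask):
    """Build one prompt as an attention-weighted mix of unmasked source prompts.

    query:          feature vector of the input (list of floats)
    source_prompts: learned source prompts, each the same length as query
    mask:           one bool per source prompt; True means "hidden"
    Assumption: scaled dot-product attention stands in for whatever
    attention module the paper actually uses.
    """
    kept = [p for p, hidden in zip(source_prompts, mask) if not hidden]
    if not kept:
        raise ValueError("at least one source prompt must remain unmasked")
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, p)) / scale for p in kept]
    weights = softmax(scores)
    return [sum(w * p[i] for w, p in zip(weights, kept)) for i in range(len(query))]
```

Masking some source prompts during training forces the generated prompt to be composed from the remaining ones, which is one plausible reading of how such a generator could learn to handle domains and classes it has no dedicated prompt for.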
Related papers
- QueryAdapter: Rapid Adaptation of Vision-Language Models in Response to Natural Language Queries (2025) · 0.00
- ProS: Prompting-to-Simulate Generalized Knowledge for Universal Cross-Domain Retrieval (2023) · 12.56
- Dynamic Adapter with Semantics Disentangling for Cross-Lingual Cross-Modal Retrieval (2024) · 2.26
- Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models (2024) · 0.00
- UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-Modal Modeling (2023) · 3.77
- Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models (2024) · 6.77
- Test-Time Training for Data-Efficient UCDR (2022) · 0.00
- MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval (2023) · 9.76