Multimodal Unsupervised Domain Generalization By Retrieving Across The Modality Gap
2024 · Christopher Liao, Christian So, Theodoros Tsiligkaridis, et al.
Abstract
Domain generalization (DG) is an important problem that learns a model which generalizes to unseen test domains leveraging one or more source domains, under the assumption of shared label spaces. However, most DG methods assume access to abundant source data in the target label space, a requirement that proves overly stringent for numerous real-world applications, where acquiring the same label space as the target task is prohibitively expensive. For this setting, we tackle the multimodal version of the unsupervised domain generalization (MUDG) problem, which uses a large task-agnostic unlabeled source dataset during finetuning. Our framework does not explicitly assume any relationship between the source dataset and target task. Instead, it relies only on the premise that the source dataset can be accurately and efficiently searched in a joint vision-language space. We make three contributions in the MUDG setting. Firstly, we show theoretically that cross-modal approximate nearest neig
Related papers
- Universal Cross-domain Retrieval: Generalizing Across Classes And Domains (2021)
- Learning Unseen Modality Interaction (2023)
- GME: Improving Universal Multimodal Retrieval By Multimodal LLMs (2024)
- Unigraph2: Learning A Unified Embedding Space To Bind Multimodal Graphs (2025)
- Test-time Training For Data-efficient UCDR (2022)
- Modality Curation: Building Universal Embeddings For Advanced Multimodal Information Retrieval (2025)
- Generalized Contrastive Learning For Universal Multimodal Retrieval (2025)
- GPL: Generative Pseudo Labeling For Unsupervised Domain Adaptation Of Dense Retrieval (2021)