Exploring Uncertainty In Conditional Multi-modal Retrieval Systems
2019 · Ahmed Taha, Yi-Ting Chen, Xitong Yang, et al.
Abstract
We cast visual retrieval as a regression problem by posing triplet loss as a regression loss. This enables epistemic uncertainty estimation in retrieval using dropout as a Bayesian approximation framework. Accordingly, Monte Carlo (MC) sampling is leveraged to boost retrieval performance. Our approach is evaluated on two applications: person re-identification and autonomous car driving. For the former application, results comparable to the state of the art are achieved on multiple datasets. For the autonomous car driving application, we leverage the Honda driving dataset (HDD). It provides multiple modalities and similarity notions for ego-motion action understanding. Hence, we present a multi-modal conditional retrieval network. It disentangles embeddings into separate representations to encode different similarities. This form of joint learning eliminates the need to train multiple independent networks, without any performance degradation. Quantitative evaluation highlights our approach competen
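The MC sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single linear layer standing in for the embedding network, the function names (`triplet_loss`, `embed_with_dropout`, `mc_embedding`), the margin, the dropout rate, and the sample count are all assumptions chosen for illustration. The sketch keeps dropout active at inference, averages several stochastic embeddings, and uses their variance as a rough proxy for epistemic uncertainty.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss on squared Euclidean distances (illustrative form)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin)


def embed_with_dropout(x, weights, drop_prob, rng):
    """One stochastic forward pass: dropout stays on at inference time."""
    keep = rng.random(x.shape) >= drop_prob
    x_dropped = x * keep / (1.0 - drop_prob)  # inverted-dropout scaling
    z = x_dropped @ weights                   # hypothetical one-layer "network"
    norm = np.linalg.norm(z, axis=-1, keepdims=True)
    return z / (norm + 1e-12)                 # unit-length embedding


def mc_embedding(x, weights, drop_prob=0.5, n_samples=50, seed=0):
    """Average n_samples stochastic embeddings; return (mean, uncertainty)."""
    rng = np.random.default_rng(seed)
    samples = np.stack(
        [embed_with_dropout(x, weights, drop_prob, rng) for _ in range(n_samples)]
    )
    mean = samples.mean(axis=0)
    # Total variance across embedding dimensions, one scalar per input.
    uncertainty = samples.var(axis=0).sum(axis=-1)
    return mean, uncertainty


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    weights = rng.normal(size=(16, 4))  # assumed: 16 input features, 4-D embedding
    queries = rng.normal(size=(3, 16))
    mean, uncertainty = mc_embedding(queries, weights)
    print(mean.shape, uncertainty.shape)
```

In a retrieval setting, the mean embedding would replace the single deterministic embedding when ranking, while the per-input variance flags queries the model is less sure about. With `drop_prob=0.0` every pass is identical, so the uncertainty collapses to zero.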
Related papers
- Unsupervised Data Uncertainty Learning In Visual Retrieval Systems (2019) · 0.00
- Bayesian Triplet Loss: Uncertainty Quantification In Image Retrieval (2020) · 11.49
- Uncertainty-based Cross-modal Retrieval With Probabilistic Representations (2022) · 0.00
- Universal Vision-language Dense Retrieval: Learning A Unified Representation Space For Multi-modal Retrieval (2022) · 3.45
- Prototype-based Aleatoric Uncertainty Quantification For Cross-modal Retrieval (2023) · 6.50
- Probabilistic Embeddings For Cross-modal Retrieval (2021) · 21.70
- Beyond Global Similarity: Towards Fine-grained, Multi-condition Multimodal Retrieval (2026) · 2.20
- Reasoning-augmented Representations For Multimodal Retrieval (2026) · 0.00