Let All Be Whitened: Multi-teacher Distillation For Efficient Visual Retrieval
2023 · Zhe Ma, Jianfeng Dong, Shouling Ji, et al.
Abstract
Visual retrieval aims to search for the most relevant visual items, e.g., images and videos, from a candidate gallery with a given query item. Accuracy and efficiency are two competing objectives in retrieval tasks. Instead of crafting a new method pursuing further improvement on accuracy, in this paper we propose a multi-teacher distillation framework Whiten-MTD, which is able to transfer knowledge from off-the-shelf pre-trained retrieval models to a lightweight student model for efficient visual retrieval. Furthermore, we discover that the similarities obtained by different retrieval models are diversified and incommensurable, which makes it challenging to jointly distill knowledge from multiple models. Therefore, we propose to whiten the output of teacher models before fusion, which enables effective multi-teacher distillation for retrieval models. Whiten-MTD is conceptually simple and practically effective. Extensive experiments on two landmark image retrieval datasets and one vide
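The abstract's central idea, whitening each teacher's similarities before fusing them, can be illustrated with a minimal sketch. The abstract does not specify the whitening transform or the fusion rule, so this example assumes the simplest reading: per-teacher standardization of similarity scores (zero mean, unit variance) followed by an element-wise mean. The function names `whiten` and `fuse_teachers` and the score values are hypothetical, chosen only for illustration.

```python
from statistics import fmean, pstdev


def whiten(scores, eps=1e-8):
    """Standardize one teacher's similarity scores to zero mean, unit variance.

    Assumption: scalar standardization stands in for the paper's whitening step.
    """
    mu = fmean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


def fuse_teachers(per_teacher_scores):
    """Average the whitened scores across teachers, gallery item by gallery item.

    Assumption: plain mean fusion; the abstract does not name the fusion rule.
    """
    whitened = [whiten(scores) for scores in per_teacher_scores]
    return [fmean(column) for column in zip(*whitened)]


# Hypothetical similarities of one query to four gallery items.
teacher_a = [0.91, 0.88, 0.35, 0.10]  # cosine-like scores in [0, 1]
teacher_b = [12.0, 3.0, 9.0, 1.0]     # unnormalized dot products

fused = fuse_teachers([teacher_a, teacher_b])

# Gallery indices sorted from most to least similar under the fused target.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

Without the whitening step, `teacher_b` would dominate a plain average because its scores live on a much larger scale. After standardization both teachers contribute on a comparable scale, and `fused` could then serve as the target similarity for a student model.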
Related papers
- AMMKD: Adaptive Multimodal Multi-teacher Distillation For Lightweight Vision-language Models (2025)
- Data-efficient Ranking Distillation For Image Retrieval (2020)
- MCAD: Multi-teacher Cross-modal Alignment Distillation For Efficient Image-text Retrieval (2023)
- TEACHTEXT: Crossmodal Generalized Distillation For Text-video Retrieval (2021)
- Embeddistill: A Geometric Knowledge Distillation For Information Retrieval (2023)
- Towards A Smaller Student: Capacity Dynamic Distillation For Efficient Image Retrieval (2023)
- Context Unaware Knowledge Distillation For Image Retrieval (2022)
- C2KD: Cross-lingual Cross-modal Knowledge Distillation For Multilingual Text-video Retrieval (2022)