C2KD: Cross-lingual Cross-modal Knowledge Distillation For Multilingual Text-video Retrieval
2022 · Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, et al.
Abstract
Multilingual text-video retrieval methods have improved significantly in recent years, but the performance for other languages lags behind English. We propose a Cross-Lingual Cross-Modal Knowledge Distillation method to improve multilingual text-video retrieval. Inspired by the fact that English text-video retrieval outperforms other languages, we train a student model using input text in different languages to match the cross-modal predictions from teacher models using input text in English. We propose a cross entropy based objective which forces the distribution over the student's text-video similarity scores to be similar to those of the teacher models. We introduce a new multilingual video dataset, Multi-YouCook2, by translating the English captions in the YouCook2 video dataset to 8 other languages. Our method improves multilingual text-video retrieval performance on Multi-YouCook2 and several other datasets such as Multi-MSRVTT and VATEX. We also conducted an analysis on the effe
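The cross entropy objective described in the abstract can be illustrated with a minimal sketch. It assumes a single teacher, a batch of paired captions and videos, and a softmax over videos for each caption. The function names, the `temperature` parameter, and the row/column layout of the similarity matrices are assumptions made for illustration, not details taken from the paper.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over one row of similarity scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def distillation_cross_entropy(student_sim, teacher_sim, temperature=1.0):
    """Mean cross entropy between teacher and student distributions over videos.

    Assumed layout (hypothetical, for illustration only):
      student_sim[i][j]: student similarity of non-English caption i and video j
      teacher_sim[i][j]: teacher similarity of English caption i and video j
    """
    if len(student_sim) != len(teacher_sim):
        raise ValueError("student and teacher must score the same captions")
    total = 0.0
    for s_row, t_row in zip(student_sim, teacher_sim):
        # Teacher scores become a soft target distribution over the videos.
        p_teacher = [math.exp(lp) for lp in log_softmax(t_row, temperature)]
        log_p_student = log_softmax(s_row, temperature)
        # Cross entropy H(p_teacher, p_student) for this caption.
        total -= sum(pt * ls for pt, ls in zip(p_teacher, log_p_student))
    return total / len(student_sim)


if __name__ == "__main__":
    teacher = [[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]]
    student = [[0.0, 1.0, 4.0], [3.0, 0.5, 0.2]]
    print(distillation_cross_entropy(student, teacher))
```

For a fixed teacher, this loss is smallest when the student's distribution over videos equals the teacher's, which is the sense in which the objective pulls the student's non-English predictions toward the teacher's English ones.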
Related papers
- TEACHTEXT: Crossmodal Generalized Distillation For Text-video Retrieval (2021)
- Translate-distill: Learning Cross-language Dense Retrieval By Translation And Distillation (2024)
- MCAD: Multi-teacher Cross-modal Alignment Distillation For Efficient Image-text Retrieval (2023)
- CL2CM: Improving Cross-lingual Cross-modal Retrieval Via Cross-lingual Knowledge Transfer (2023)
- Graph-based Cross-domain Knowledge Distillation For Cross-dataset Text-to-image Person Retrieval (2025)
- AMMKD: Adaptive Multimodal Multi-teacher Distillation For Lightweight Vision-language Models (2025)
- MDMMT-2: Multidomain Multimodal Transformer For Video Retrieval, One More Step Towards Generalization (2022)
- Dual Learning With Dynamic Knowledge Distillation And Soft Alignment For Partially Relevant Video Retrieval (2025)