EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval
2023 · Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, et al.
Abstract
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR). In this paper, we aim to improve distillation methods that pave the way for the resource-efficient deployment of such models in practice. Inspired by our theoretical analysis of the teacher-student generalization gap for IR models, we propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model. Unlike existing teacher score-based distillation methods, our proposed approach employs embedding matching tasks to provide a stronger signal for aligning the representations of the teacher and student models. In addition, it uses query generation to explore the data manifold and reduce the discrepancies between the student and the teacher where training data is sparse. Furthermore, our analysis also motivates novel asymmetric architectures for student models that realize better embedding alignment without i
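The contrast the abstract draws between score-based distillation and embedding matching can be illustrated with a small sketch. This is not the paper's implementation: the function names, the softmax cross-entropy form of the score loss, the squared-L2 form of the matching loss, and the assumption that teacher and student embeddings share one dimension are all illustrative choices. The toy data shows a student whose embeddings are a rotated copy of the teacher's: its query-document scores agree with the teacher's exactly, so a score-based loss cannot tell the two apart, while the embedding matching loss is nonzero.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def score_distill_loss(teacher_q, teacher_docs, student_q, student_docs):
    """Hypothetical score-based loss: cross-entropy between the teacher's and
    the student's softmax distributions over query-document dot products."""
    t = softmax([dot(teacher_q, d) for d in teacher_docs])
    s = softmax([dot(student_q, d) for d in student_docs])
    return -sum(ti * math.log(si) for ti, si in zip(t, s))


def embed_match_loss(teacher_vecs, student_vecs):
    """Hypothetical embedding matching loss: mean squared L2 distance between
    paired teacher and student embeddings (assumes equal dimensions)."""
    total = 0.0
    for t, s in zip(teacher_vecs, student_vecs):
        total += sum((a - b) ** 2 for a, b in zip(t, s))
    return total / len(teacher_vecs)


# Toy teacher embeddings: one query and two documents in 2-D.
teacher_q = [1, 0]
teacher_docs = [[1, 0], [0, 1]]

# Toy student: every teacher vector rotated by 90 degrees.
# Rotation preserves dot products, so all scores are unchanged.
student_q = [0, 1]
student_docs = [[0, 1], [-1, 0]]

# Score loss for the rotated student vs. a student identical to the teacher.
score_rotated = score_distill_loss(teacher_q, teacher_docs, student_q, student_docs)
score_identical = score_distill_loss(teacher_q, teacher_docs, teacher_q, teacher_docs)

# Embedding matching loss over the query and both documents.
match_rotated = embed_match_loss([teacher_q] + teacher_docs, [student_q] + student_docs)

print(score_rotated - score_identical)  # the two score losses coincide
print(match_rotated)                    # the matching loss is nonzero
```

In this sketch the matching term carries information about the embedding geometry that the scores alone discard, which is one reading of the "stronger signal" the abstract refers to.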
Related papers
- Data-efficient Ranking Distillation For Image Retrieval (2020) 0.00
- Knowledge Distillation In Document Retrieval (2019) 0.00
- Pairdistill: Pairwise Relevance Distillation For Dense Retrieval (2024) 7.24
- LEAF: Knowledge Distillation Of Text Embedding Models With Teacher-aligned Representations (2025) 0.00
- Context Unaware Knowledge Distillation For Image Retrieval (2022) 0.60
- Translate-distill: Learning Cross-language Dense Retrieval By Translation And Distillation (2024) 8.60
- Learning Effective Representations For Retrieval Using Self-distillation With Adaptive Relevance Margins (2024) 2.26
- Towards A Smaller Student: Capacity Dynamic Distillation For Efficient Image Retrieval (2023) 10.07