Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge From Dense Embeddings
2022 · Shitao Xiao, Zheng Liu, Weihao Han, et al.
Abstract
Vector quantization (VQ) based ANN indexes, such as Inverted File System (IVF) and Product Quantization (PQ), have been widely applied to embedding based document retrieval thanks to the competitive time and memory efficiency. Originally, VQ is learned to minimize the reconstruction loss, i.e., the distortions between the original dense embeddings and the reconstructed embeddings after quantization. Unfortunately, such an objective is inconsistent with the goal of selecting ground-truth documents for the input query, which may cause severe loss of retrieval quality. Recent works identify such a defect, and propose to minimize the retrieval loss through contrastive learning. However, these methods intensively rely on queries with ground-truth documents, whose performance is limited by the insufficiency of labeled data. In this paper, we propose Distill-VQ, which unifies the learning of IVF and PQ within a knowledge distillation framework. In Distill-VQ, the dense embeddings are levera
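The reconstruction loss that the abstract describes as the original VQ objective can be made concrete with a small sketch. The code below is not Distill-VQ; it only illustrates the classic baseline, assuming product quantization with one k-means codebook per sub-space and mean squared error between each embedding and its reconstruction. All function names (`train_pq`, `encode`, `reconstruct`, `reconstruction_loss`) and the toy data are hypothetical and chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[j].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid when a cluster is empty
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def split(vec, m):
    """Cut a vector into m contiguous sub-vectors (dim must divide by m)."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def train_pq(embeddings, m, k):
    """Learn one codebook of k codewords for each of the m sub-spaces."""
    return [kmeans([split(e, m)[s] for e in embeddings], k, seed=s)
            for s in range(m)]

def encode(vec, codebooks):
    """Replace each sub-vector with the index of its nearest codeword."""
    m = len(codebooks)
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(split(vec, m), codebooks)]

def reconstruct(codes, codebooks):
    """Concatenate the selected codewords back into a full vector."""
    out = []
    for c, cb in zip(codes, codebooks):
        out.extend(cb[c])
    return out

def reconstruction_loss(embeddings, codebooks):
    """Mean squared distortion between embeddings and their reconstructions."""
    total = 0.0
    for e in embeddings:
        total += sq_dist(e, reconstruct(encode(e, codebooks), codebooks))
    return total / len(embeddings)

# Toy "dense embeddings": 64 random 8-dimensional vectors (illustrative only).
rng = random.Random(42)
docs = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(64)]

codebooks = train_pq(docs, m=4, k=4)   # 4 sub-spaces, 4 codewords each
codes = encode(docs[0], codebooks)     # compact code for the first document
loss = reconstruction_loss(docs, codebooks)
print("codes:", codes, "reconstruction loss:", loss)
```

Note that this objective never looks at queries or relevance labels: it only asks how closely each quantized vector matches its original, which is the mismatch with retrieval quality that the abstract points out.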
Related papers
- Jointly Optimizing Query Encoder And Product Quantization To Improve Retrieval Performance (2021)
- Matching-oriented Product Quantization For Ad-hoc Retrieval (2021)
- Beyond Product Quantization: Deep Progressive Quantization For Image Retrieval (2019)
- Distilling Vision-language Pretraining For Efficient Cross-modal Retrieval (2024)
- AiSAQ: All-in-storage ANNS With Product Quantization For DRAM-free Information Retrieval (2024)
- Learning Discrete Representations Via Constrained Clustering For Effective And Efficient Dense Retrieval (2021)
- Joint Learning Of Deep Retrieval Model And Product Quantization Based Embedding Index (2021)
- Self-supervised Product Quantization For Deep Unsupervised Image Retrieval (2021)