Adaptive Retrieval and Scalable Indexing for k-NN Search with Cross-Encoders
2024 · Nishant Yadav, Nicholas Monath, Manzil Zaheer, et al.
Abstract
Cross-encoder (CE) models, which compute similarity by jointly encoding a query-item pair, perform better than embedding-based models (dual-encoders) at estimating query-item relevance. Existing approaches perform k-NN search with a CE by approximating the CE similarity with a vector embedding space fit either with dual-encoders (DE) or with CUR matrix factorization. DE-based retrieve-and-rerank approaches suffer from poor recall on new domains, and retrieval with the DE is decoupled from the CE. While CUR-based approaches can be more accurate than DE-based approaches, they require a prohibitively large number of CE calls to compute item embeddings, making them impractical for deployment at scale. In this paper, we address these shortcomings with our proposed sparse-matrix-factorization-based method, which efficiently computes latent query and item embeddings to approximate CE scores and performs k-NN search with the approximate CE similarity. We compute item embeddings offline by factorizing
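The general idea in the abstract, fitting latent query and item embeddings to a sparse set of CE scores and then retrieving by inner product, can be illustrated with a minimal sketch. This is not the authors' implementation: the alternating-least-squares solver, the function names `factorize_sparse_scores` and `top_k`, and all hyperparameters are assumptions chosen for illustration, and the observed scores would in practice come from CE calls on a small subset of query-item pairs.

```python
import numpy as np

def factorize_sparse_scores(rows, cols, vals, n_queries, n_items,
                            dim=4, reg=0.1, n_iters=20, seed=0):
    """Fit U (queries) and V (items) so that U[q] @ V[i] approximates vals
    on the observed (q, i) pairs only. ALS is an assumed solver here."""
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    vals = np.asarray(vals, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_queries, dim))
    V = rng.normal(scale=0.1, size=(n_items, dim))
    ridge = reg * np.eye(dim)
    for _ in range(n_iters):
        # Update each query embedding with item embeddings held fixed.
        for q in range(n_queries):
            mask = rows == q
            if mask.any():
                A = V[cols[mask]]
                U[q] = np.linalg.solve(A.T @ A + ridge, A.T @ vals[mask])
        # Update each item embedding with query embeddings held fixed.
        for i in range(n_items):
            mask = cols == i
            if mask.any():
                A = U[rows[mask]]
                V[i] = np.linalg.solve(A.T @ A + ridge, A.T @ vals[mask])
    return U, V

def top_k(query_emb, V, k):
    """Return indices of the k items with the highest approximate score."""
    scores = V @ query_emb
    idx = np.argpartition(-scores, k - 1)[:k]
    return idx[np.argsort(-scores[idx])]
```

A typical use would be to call `factorize_sparse_scores` once offline on the observed scores, keep `V` as the item index, and call `top_k` per query; the retrieved candidates could then be re-scored with the exact CE.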
Related papers
- Efficient k-NN Search With Cross-Encoders Using Adaptive Multi-Round CUR Decomposition (2023)
- Efficient Nearest Neighbor Search For Cross-Encoder Models Using Matrix Factorization (2022)
- Can Cross Encoders Produce Useful Sentence Embeddings? (2025)
- Comparing Neighbors Together Makes It Easy: Jointly Comparing Multiple Candidates For Efficient And Effective Retrieval (2024)
- kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-Interest Candidate Retrieval (2022)
- Efficient Document Ranking With Learnable Late Interactions (2024)
- Efficient Neural Ranking Using Forward Indexes And Lightweight Encoders (2023)
- NUDGE: Lightweight Non-Parametric Fine-Tuning Of Embeddings For Retrieval (2024)