Scaling Laws For Embedding Dimension In Information Retrieval
2026 · Julian Killingback, Mahta Rafiee, Madine Manas, et al.
Abstract
Dense retrieval, which encodes each query and document into a single dense vector, has become the dominant neural retrieval approach due to its simplicity and compatibility with fast approximate nearest neighbor algorithms. As the tasks dense retrieval performs grow in complexity, the fundamental limitations of the underlying data structure and similarity metric -- namely vectors and inner products -- become more apparent. Recent work has shown theoretical limitations inherent to single vectors and inner products that are generally tied to the embedding dimension. Given the importance of embedding dimension for retrieval capacity, understanding how dense retrieval performance changes as embedding dimension is scaled is fundamental to building next-generation retrieval models that balance effectiveness and efficiency. In this work, we conduct a comprehensive analysis of the relationship between embedding dimension and retrieval performance. Our experiments include two model families
Related papers
- Scaling Laws For Dense Retrieval (2024)
- On The Theoretical Limitations Of Embedding-based Retrieval (2025)
- On Strengths And Limitations Of Single-vector Embeddings (2026)
- Breaking The Curse Of Dimensionality: On The Stability Of Modern Vector Retrieval (2025)
- Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm (2025)
- Learning To Select: Query-aware Adaptive Dimension Selection For Dense Retrieval (2026)
- Dense Retrievers Can Fail On Simple Queries: Revealing The Granularity Dilemma Of Embeddings (2025)
- Dimension Reduction For Efficient Dense Retrieval Via Conditional Autoencoder (2022)