Dimension Vs. Precision: A Comparative Analysis Of Autoencoders And Quantization For Efficient Vector Retrieval On BEIR Scifact
2025 · Satyanarayan Pati
Abstract
Dense retrieval models have become a standard for state-of-the-art information retrieval. However, their high-dimensional, high-precision (float32) vector embeddings create significant storage and memory challenges for real-world deployment. To address this, we conduct a rigorous empirical study on the BEIR SciFact benchmark, evaluating the trade-offs between two primary compression strategies: (1) dimensionality reduction via deep autoencoders (AE), which map the original 384-dimensional vectors to latent spaces ranging from 384 down to 12 dimensions, and (2) precision reduction via quantization (float16, int8, and binary). We systematically compare the methods by measuring the "performance loss" (or gain) of each relative to a float32 baseline across a full suite of retrieval metrics (nDCG, MAP, MRR, Recall, Precision) at various k cutoffs. Our results show that int8 scalar quantization provides the most effective "sweet spot," achieving 4x compression with a negligible (~1-2%) drop in nDCG@10. In contrast, Autoencoders
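The storage side of the quantization comparison can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the embeddings are random stand-ins for real 384-dimensional vectors, the per-dimension min/max calibration for int8 and the sign threshold for binary are assumed schemes, and the helper names (`quantize_int8`, `dequantize_int8`, `quantize_binary`) are hypothetical. It only shows where the 2x, 4x, and 32x size reductions come from, not the retrieval quality of each format.

```python
import numpy as np


def quantize_int8(x, lo, hi):
    """Map each dimension's [lo, hi] range linearly onto the 256 int8 levels."""
    scale = (hi - lo) / 255.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant dimensions
    q = np.round((x - lo) / scale) - 128
    return np.clip(q, -128, 127).astype(np.int8), scale


def dequantize_int8(q, lo, scale):
    """Approximate inverse of quantize_int8."""
    return (q.astype(np.float32) + 128.0) * scale + lo


def quantize_binary(x):
    """Keep one bit per dimension (1 where the value is positive), packed 8 per byte."""
    return np.packbits(x > 0, axis=1)


# Random stand-in for a corpus of 384-dim float32 embeddings (assumption).
rng = np.random.default_rng(0)
emb = rng.standard_normal((1000, 384)).astype(np.float32)

# Calibration ranges estimated from the corpus itself (assumed scheme).
lo, hi = emb.min(axis=0), emb.max(axis=0)

emb_f16 = emb.astype(np.float16)
emb_i8, scale = quantize_int8(emb, lo, hi)
emb_bin = quantize_binary(emb)

# Compression ratio relative to the float32 baseline.
ratios = {
    "float16": emb.nbytes / emb_f16.nbytes,
    "int8": emb.nbytes / emb_i8.nbytes,
    "binary": emb.nbytes / emb_bin.nbytes,
}

# Worst-case reconstruction error of the int8 round trip.
max_err = float(np.abs(dequantize_int8(emb_i8, lo, scale) - emb).max())

print(ratios)
print(max_err)
```

Under these assumptions a 384-dimensional vector shrinks from 1536 bytes (float32) to 768 (float16), 384 (int8), and 48 (binary). The int8 round-trip error is bounded by roughly half a quantization step per dimension, which is consistent with the small quality drop the abstract reports for int8.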