Scalable Overload-aware Graph-based Index Construction For 10-billion-scale Vector Similarity Search
2025 Β· Yang Shi, Yiping Sun, Jiaolong Du, et al.
Abstract
Approximate Nearest Neighbor Search (ANNS) is essential for modern data-driven applications that require efficient retrieval of top-k results from massive vector databases. Although existing graph-based ANNS algorithms achieve a high recall rate on billion-scale datasets, their slow construction speed and limited scalability hinder their applicability to large-scale industrial scenarios. In this paper, we introduce SOGAIC, the first Scalable Overload-Aware Graph-Based ANNS Index Construction system tailored for ultra-large-scale vector databases: 1) We propose a dynamic data partitioning algorithm with overload constraints that adaptively introduces overlaps among subsets; 2) To enable efficient distributed subgraph construction, we employ a load-balancing task scheduling framework combined with an agglomerative merging strategy; 3) Extensive experiments on various datasets demonstrate a reduction of 47.3% in average construction time compared to existing methods. The proposed method h
Authors
(none)
Tags
Stats
Related papers
- DGAI: Decoupled On-disk Graph-based ANN Index For Efficient Updates And Queries (2025)0.00
- High Dimensional Similarity Search With Satellite System Graph: Efficiency, Scalability, And Unindexed Query Compatibility (2019)17.30
- Efficient And Effective Retrieval Of Dense-sparse Hybrid Vectors Using Graph-based Approximate Nearest Neighbor Search (2024)0.00
- Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph (2017)18.20
- Frequency-aware Graph Construction And Search For Dynamic Vector Databases (2025)0.00
- CAGRA: Highly Parallel Graph Construction And Approximate Nearest Neighbor Search For Gpus (2023)12.17
- Ood-diskann: Efficient And Scalable Graph ANNS For Out-of-distribution Queries (2022)0.00
- BAMG: A Block-aware Monotonic Graph Index For Disk-based Approximate Nearest Neighbor Search (2025)0.00