Stars: Tera-scale Graph Building For Clustering And Graph Learning
2022 Β· Cj Carey, Jonathan Halcrow, Rajesh Jayaram, et al.
Abstract
A fundamental procedure in the analysis of massive datasets is the construction of similarity graphs. Such graphs play a key role for many downstream tasks, including clustering, classification, graph learning, and nearest neighbor search. For these tasks, it is critical to build graphs which are sparse yet still representative of the underlying data. The benefits of sparsity are twofold: firstly, constructing dense graphs is infeasible in practice for large datasets, and secondly, the runtime of downstream tasks is directly influenced by the sparsity of the similarity graph. In this work, we present \(\textit\{Stars\}\): a highly scalable method for building extremely sparse graphs via two-hop spanners, which are graphs where similar points are connected by a path of length at most two. Stars can construct two-hop spanners with significantly fewer similarity comparisons, which are a major bottleneck for learning based models where comparisons are expensive to evaluate. Theoretically,
Authors
(none)
Tags
Stats
Related papers
- Grale: Designing Networks For Graph Learning (2020)10.85
- Large-scale Graph Building In Dynamic Environments: Low Latency And High Quality (2025)0.00
- Visualizing Large-scale And High-dimensional Data (2016)18.48
- Deep Graph Similarity Learning: A Survey (2019)13.97
- Co-embedding: Discovering Communities On Bipartite Graphs Through Projection (2021)0.00
- Super-sparse Learning In Similarity Spaces (2017)5.24
- STAR: A Simple Training-free Approach For Recommendations Using Large Language Models (2024)0.00
- PECANN: Parallel Efficient Clustering With Graph-based Approximate Nearest Neighbor Search (2023)0.00