Unconventional Application Of K-means For Distributed Approximate Similarity Search
2022 · Felipe Ortega, Maria Jesus Algar, Isaac Martín de Diego, et al.
Abstract
Similarity search based on a distance function in metric spaces is a fundamental problem for many applications. Queries for similar objects lead to the well-known machine learning task of nearest-neighbours identification. Many data indexing strategies, collectively known as Metric Access Methods (MAM), have been proposed to speed up queries for similar elements in this context. Moreover, since exact approaches to solve similarity queries can be complex and time-consuming, alternative options have appeared to reduce query execution time, such as returning approximate results or resorting to distributed computing platforms. In this paper, we introduce MASK (Multilevel Approximate Similarity search with \(k\)-means), an unconventional application of the \(k\)-means algorithm as the foundation of a multilevel index structure for approximate similarity search, suitable for metric spaces. We show that inherent properties of \(k\)-means, like representing high-density data areas with fewer p
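To make the idea of a multilevel index built on \(k\)-means more concrete, here is a minimal single-machine sketch. It is not the MASK structure itself, whose details the abstract does not give. It assumes a generic two-level layout: \(k\)-means groups the data points into leaf clusters, and a second \(k\)-means groups the leaf centroids. The names `kmeans`, `build_index`, `search`, `k_leaf`, `k_top`, and `n_probe` are hypothetical. Euclidean distance and plain Lloyd iterations are assumptions made for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Group point indices by their nearest centroid."""
    groups = [[] for _ in centroids]
    for i, p in enumerate(points):
        nearest = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        groups[nearest].append(i)
    return groups


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, groups of point indices)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        for c, members in enumerate(assign(points, centroids)):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # Final pass so the returned groups match the returned centroids.
    return centroids, assign(points, centroids)


def build_index(points, k_leaf, k_top):
    """Two-level index: leaf clusters over points, top clusters over leaf centroids."""
    leaf_centroids, leaves = kmeans(points, k_leaf)
    top_centroids, top_groups = kmeans(leaf_centroids, k_top)
    return {
        "points": points,
        "leaf_centroids": leaf_centroids,
        "leaves": leaves,
        "top_centroids": top_centroids,
        "top_groups": top_groups,
    }


def search(index, query, n_neighbors=1, n_probe=2):
    """Approximate search: visit the n_probe closest clusters at each level."""
    tops = sorted(
        range(len(index["top_centroids"])),
        key=lambda t: dist(query, index["top_centroids"][t]),
    )[:n_probe]
    leaf_ids = [leaf for t in tops for leaf in index["top_groups"][t]]
    leaf_ids.sort(key=lambda leaf: dist(query, index["leaf_centroids"][leaf]))
    candidates = [i for leaf in leaf_ids[:n_probe] for i in index["leaves"][leaf]]
    candidates.sort(key=lambda i: dist(query, index["points"][i]))
    return candidates[:n_neighbors]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(500)]
    index = build_index(data, k_leaf=16, k_top=4)
    print(search(index, (0.5, 0.5), n_neighbors=3, n_probe=2))
```

In this sketch, `n_probe` trades accuracy for work: small values scan only a few clusters and may miss true neighbours, while probing every cluster at both levels degenerates into an exhaustive scan.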
Related papers
- Scalable K-means Clustering For Large K Via Seeded Approximate Nearest-neighbor Search (2025) · 0.00
- Let Them Have CAKES: A Cutting-edge Algorithm For Scalable, Efficient, And Exact Search On Big Data (2023) · 2.68
- Indexing Metric Spaces For Exact Similarity Search (2020) · 10.85
- A Memory-efficient Distributed Algorithm For Approximate Nearest Neighbour Search With Arbitrary Distances (2024) · 0.00
- Exact Trajectory Similarity Search With N-tree: An Efficient Metric Index For Knn And Range Queries (2024) · 0.00
- Efficient Data-aware Distance Comparison Operations For High-dimensional Approximate Nearest Neighbor Search (2024) · 5.24
- Effective And General Distance Computation For Approximate Nearest Neighbor Search (2024) · 5.84
- Hd-index: Pushing The Scalability-accuracy Boundary For Approximate Knn Search In High-dimensional Spaces (2018) · 14.02