Climber++: Pivot-based Approximate Similarity Search Over Big Data Series
2024 · Liang Zhang, Mohamed Y. Eltabakh, Elke A. Rundensteiner, et al.
Abstract
The generation and collection of big data series are becoming an integral part of many emerging applications in sciences, IoT, finance, and web applications among several others. The terabyte-scale of data series has motivated recent efforts to design fully distributed techniques for supporting operations such as approximate kNN similarity search, which is a building block operation in most analytics services on data series. Unfortunately, these techniques are heavily geared towards achieving scalability at the cost of sacrificing the results' accuracy. State-of-the-art systems report accuracy below 10% and 40%, respectively, which is not practical for many real-world applications. In this paper, we investigate the root problems in these existing techniques that limit their ability to achieve a better trade-off between scalability and accuracy. Then, we propose a framework, called CLIMBER, that encompasses a novel feature extraction mechanism, indexing scheme, and query processing algo
Related papers
- Return Of The Lernaean Hydra: Experimental Evaluation Of Data Series Approximate Similarity Search (2020) · 0.00
- DumpyOS: A Data-adaptive Multi-ary Index For Scalable Data Series Similarity Search (2024) · 5.24
- The Lernaean Hydra Of Data Series Similarity Search: An Experimental Evaluation Of The State Of The Art (2020) · 0.00
- Indexing Metric Spaces For Exact Similarity Search (2020) · 10.85
- Let Them Have CAKES: A Cutting-edge Algorithm For Scalable, Efficient, And Exact Search On Big Data (2023) · 2.68
- HD-Index: Pushing The Scalability-accuracy Boundary For Approximate kNN Search In High-dimensional Spaces (2018) · 14.02
- ProS: Data Series Progressive k-NN Similarity Search And Classification With Probabilistic Quality Guarantees (2022) · 7.81
- Ascent Similarity Caching With Approximate Indexes (2021) · 2.26