Sub-linear Memory Sketches For Near Neighbor Search On Streaming Data
2019 Β· Benjamin Coleman, Richard G. Baraniuk, Anshumali Shrivastava
Abstract
We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset. Our online sketching algorithm compresses an N element dataset to a sketch of size \(O(N^b log^3 N)\) in \(O(N^\{(b+1)\} log^3 N)\) time, where \(b < 1\). This sketch can correctly report the nearest neighbors of any query that satisfies a stability condition parameterized by \(b\). We achieve sublinear memory performance on stable queries by combining recent advances in locality sensitive hash (LSH)-based estimators, online kernel density estimation, and compressed sensing. Our theoretical results shed new light on the memory-accuracy tradeoff for nearest neighbor search, and our sketch, which consists entirely of short integer arrays, has a variety of attractive features in practice. We evaluate the memory-recall tradeoff of our method on a friend recommendation task in the Google Plus social media network. We obtain orders of magnitude better compression than the random proje
Authors
(none)
Tags
Stats
Related papers
- Streaming Binary Sketching Based On Subspace Tracking And Diagonal Uniformization (2017)0.00
- A Scalable Solution To The Nearest Neighbor Search Problem Through Local-search Methods On Neighbor Graphs (2017)3.58
- Associative Memories To Accelerate Approximate Nearest Neighbor Search (2016)6.34
- A Memory-efficient Sketch Method For Estimating High Similarities In Streaming Sets (2019)12.02
- Nearest Neighbor Search With Compact Codes: A Decoder Perspective (2021)3.58
- Worst-case Performance Of Popular Approximate Nearest Neighbor Search Implementations: Guarantees And Limitations (2023)5.84
- Experimental Analysis Of Locality Sensitive Hashing Techniques For High-dimensional Approximate Nearest Neighbor Searches (2020)6.34
- Mmlsh: A Practical And Efficient Technique For Processing Approximate Nearest Neighbor Queries On Multimedia Data (2020)4.52