Abstract

We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset. Our online sketching algorithm compresses an N element dataset to a sketch of size \(O(N^b log^3 N)\) in \(O(N^\{(b+1)\} log^3 N)\) time, where \(b < 1\). This sketch can correctly report the nearest neighbors of any query that satisfies a stability condition parameterized by \(b\). We achieve sublinear memory performance on stable queries by combining recent advances in locality sensitive hash (LSH)-based estimators, online kernel density estimation, and compressed sensing. Our theoretical results shed new light on the memory-accuracy tradeoff for nearest neighbor search, and our sketch, which consists entirely of short integer arrays, has a variety of attractive features in practice. We evaluate the memory-recall tradeoff of our method on a friend recommendation task in the Google Plus social media network. We obtain orders of magnitude better compression than the random proje

Authors

(none)

Tags

  • ANN Search

Stats

  • citations0
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score0.00
  • arxiv keycoleman2019sub

Related papers