Fast Cosine Similarity Search In Binary Space With Angular Multi-index Hashing
2016 Β· Sepehr Eghbali, Ladan Tahvildari
Abstract
Given a large dataset of binary codes and a binary query point, we address how to efficiently find \(K\) codes in the dataset that yield the largest cosine similarities to the query. The straightforward answer to this problem is to compare the query with all items in the dataset, but this is practical only for small datasets. One potential solution to enhance the search time and achieve sublinear cost is to use a hash table populated with binary codes of the dataset and then look up the nearby buckets to the query to retrieve the nearest neighbors. However, if codes are compared in terms of cosine similarity rather than the Hamming distance, then the main issue is that the order of buckets to probe is not evident. To examine this issue, we first elaborate on the connection between the Hamming distance and the cosine similarity. Doing this allows us to systematically find the probing sequence in the hash table. However, solving the nearest neighbor search with a single table is only pra
Authors
(none)
Tags
Stats
Related papers
- Fast Search On Binary Codes By Weighted Hamming Distance (2020)0.00
- Fast Top-k Cosine Similarity Search Through Xor-friendly Binary Quantization On Gpus (2020)0.00
- Interleaved Composite Quantization For High-dimensional Similarity Search (2019)0.00
- Polysemous Codes (2016)11.49
- Learning To Hash With Semantic Similarity Metrics And Empirical KL Divergence (2020)0.00
- Query-adaptive Hash Code Ranking For Large-scale Multi-view Visual Search (2019)13.74
- A Non-alternating Graph Hashing Algorithm For Large Scale Image Search (2020)4.52
- Cluster-wise Unsupervised Hashing For Cross-modal Similarity Search (2019)11.39