Locality-sensitive Hashing For F-divergences: Mutual Information Loss And Beyond
2019 Β· Lin Chen, Hossein Esfandiari, Thomas Fu, et al.
Abstract
Computing approximate nearest neighbors in high dimensional spaces is a central problem in large-scale data mining with a wide range of applications in machine learning and data science. A popular and effective technique in computing nearest neighbors approximately is the locality-sensitive hashing (LSH) scheme. In this paper, we aim to develop LSH schemes for distance functions that measure the distance between two probability distributions, particularly for f-divergences as well as a generalization to capture mutual information loss. First, we provide a general framework to design LHS schemes for f-divergence distance functions and develop LSH schemes for the generalized Jensen-Shannon divergence and triangular discrimination in this framework. We show a two-sided approximation result for approximation of the generalized Jensen-Shannon divergence by the Hellinger distance, which may be of independent interest. Next, we show a general method of reducing the problem of designing an LSH
Authors
(none)
Tags
Stats
Related papers
- Distance-sensitive Hashing (2017)8.82
- Experimental Analysis Of Locality Sensitive Hashing Techniques For High-dimensional Approximate Nearest Neighbor Searches (2020)6.34
- Improving Locality Sensitive Hashing By Efficiently Finding Projected Nearest Neighbors (2020)6.77
- Fast Locality-sensitive Hashing Frameworks For Approximate Near Neighbor Search (2017)7.81
- Improving Similarity Search With High-dimensional Locality-sensitive Hashing (2018)0.00
- Sharing Hash Codes For Multiple Purposes (2016)0.00
- Locality Sensitive Hashing For Set-queries, Motivated By Group Recommendations (2020)0.00
- Efficient Approximate Nearest Neighbor Search For Multiple Weighted \(l_{p\leq2}\) Distance Functions (2020)0.00