Sparse Ternary Codes For Similarity Search Have Higher Coding Gain Than Dense Binary Codes
2017 Β· Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov, et al.
Abstract
This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding, which maps the original feature vectors to a less entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors is sub-optimal. Instead, a properly designed ternary encoding achieves higher coding gains and lower complexity.
Authors
(none)
Tags
Stats
Related papers
- A Multi-layer Network Based On Sparse Ternary Codes For Universal Vector Compression (2017)0.00
- Nearest Neighbor Search With Compact Codes: A Decoder Perspective (2021)3.58
- Polysemous Codes (2016)11.49
- Privacy-preserving Near Neighbor Search Via Sparse Coding With Ambiguation (2021)3.58
- Efficient And Effective Retrieval Of Dense-sparse Hybrid Vectors Using Graph-based Approximate Nearest Neighbor Search (2024)0.00
- Random Binary Trees For Approximate Nearest Neighbour Search In Binary Space (2017)2.26
- Compact Hash Codes For Efficient Visual Descriptors Retrieval In Large Scale Databases (2016)11.76
- From HNSW To Information-theoretic Binarization: Rethinking The Architecture Of Scalable Vector Search (2025)0.00