On Design Choices In Similarity-preserving Sparse Randomized Embeddings
2024 · Denis Kleyko, Dmitri A. Rachkovskij
Abstract
Expand & Sparsify is a principle observed in anatomically similar neural circuits found in the mushroom body (insects) and the cerebellum (mammals). Sensory data are projected randomly to a much higher-dimensional space (the expand part), where only a few of the most strongly excited neurons are activated (the sparsify part). This principle has been leveraged to design the FlyHash algorithm, which forms similarity-preserving sparse embeddings that have been found useful for tasks such as novelty detection, pattern recognition, and similarity search. Despite its simplicity, FlyHash has a number of design choices to be set, such as the preprocessing of the input data, the choice of sparsifying activation function, and the formation of the random projection matrix. In this paper, we explore the effect of these choices on the performance of similarity search with FlyHash embeddings. We find that the right combination of design choices can lead to a drastic difference in search performance.
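The expand-and-sparsify pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`make_projection`, `flyhash`, `overlap`), the mean-centering preprocessing, the sparse binary projection, and the top-k (winner-take-all) sparsification are all assumptions chosen for illustration, and each one corresponds to a design choice that the paper treats as open.

```python
import random


def make_projection(d_in, d_out, n_connections, seed=0):
    """Hypothetical sparse binary projection: each of the d_out output
    units sums n_connections randomly chosen input coordinates."""
    rng = random.Random(seed)
    return [rng.sample(range(d_in), n_connections) for _ in range(d_out)]


def flyhash(x, projection, k):
    """Expand x through the projection, then keep the k most strongly
    activated units. Returns the set of active unit indices."""
    # Preprocessing (assumed here): subtract the mean of the input.
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    # Expand: one activation per output unit.
    activations = [sum(centered[i] for i in idx) for idx in projection]
    # Sparsify: indices of the k largest activations (winner-take-all).
    order = sorted(range(len(activations)),
                   key=lambda j: activations[j], reverse=True)
    return frozenset(order[:k])


def overlap(a, b):
    """Similarity between two sparse embeddings: number of shared active units."""
    return len(a & b)


# Usage: a slightly perturbed copy of x should share more active units
# with x than an unrelated random vector does.
rng = random.Random(1)
x = [rng.random() for _ in range(50)]
near = [v + rng.gauss(0, 0.01) for v in x]
far = [rng.random() for _ in range(50)]

proj = make_projection(d_in=50, d_out=2000, n_connections=5)
hx = flyhash(x, proj, k=50)
print(overlap(hx, flyhash(near, proj, k=50)),
      overlap(hx, flyhash(far, proj, k=50)))
```

Representing each embedding as a set of active indices keeps the sketch short; a dense binary vector of length `d_out` would carry the same information. Swapping the centering step, the projection sampler, or the top-k rule for an alternative is the kind of variation the paper evaluates.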
Related papers
- Analysis Of Sparsehash: An Efficient Embedding Of Set-similarity Via Sparse Projections (2019)
- Improving Similarity Search With High-dimensional Locality-sensitive Hashing (2018)
- Super-sparse Learning In Similarity Spaces (2017)
- Understanding Sparse JL For Feature Hashing (2019)
- Rethinking Similarity Search: Embracing Smarter Mechanisms Over Smarter Data (2023)
- FLASH: Randomized Algorithms Accelerated Over CPU-GPU For Ultra-high Dimensional Similarity Search (2017)
- Adaptive Prefiltering For High-dimensional Similarity Search: A Frequency-aware Approach (2025)
- A Theoretical View On Sparsely Activated Networks (2022)