On The Optimal Time/space Tradeoff For Hash Tables
2021 · Michael A. Bender, Martín Farach-Colton, John Kuszmaul, et al.
Abstract
For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: if keys/values are Θ(log n) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(log log n) bits of space per key when compared to the information-theoretic optimum. Even prior to this bound being achieved, the target of O(log log n) wasted bits per key was known to be a natural end goal, and was proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(log log n) wasted bits per key is not the end of the line for hashing. In fact, for any k ∈ [log* n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and O(log^(k) n) wasted bits per key (all with high probability in n), where log^(k) n denotes the logarithm iterated k times. This m
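To make the shape of the tradeoff in the abstract concrete, the following is a minimal sketch of the two quantities it relies on: the k-times iterated logarithm log^(k) n and the iterated-logarithm count log* n. It is not taken from the paper. The function names `iterated_log` and `log_star` are hypothetical, and base-2 logarithms are an assumption, since the abstract does not fix a base and constant factors are hidden by the O-notation.

```python
import math


def iterated_log(n: float, k: int) -> float:
    """Return log^(k) n: log2 applied k times to n (base 2 is an assumption)."""
    value = float(n)
    for _ in range(k):
        if value <= 0:
            raise ValueError("logarithm undefined: k is too large for this n")
        value = math.log2(value)
    return value


def log_star(n: float) -> int:
    """Return log* n: how many times log2 must be applied until the value is <= 1."""
    count = 0
    value = float(n)
    while value > 1:
        value = math.log2(value)
        count += 1
    return count


if __name__ == "__main__":
    n = 2 ** 64
    # Each row pairs an update cost of order k with wasted bits per key of
    # order log^(k) n; constants hidden by the O-notation are ignored.
    for k in range(1, log_star(n) + 1):
        print(f"k={k}: log^(k) n = {iterated_log(n, k):.3f}")
```

Under these assumptions, k = 2 corresponds to the earlier O(log log n) bound, and each further increment of k applies one more logarithm to the per-key waste at the cost of slower insertions and deletions.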
Related papers
- Optimal Hashing-based Time-space Trade-offs For Approximate Near Neighbors (2016) · 11.29
- Practical Hash Functions For Similarity Estimation And Dimensionality Reduction (2017) · 0.00
- On The Evaluation Metric For Hashing (2019) · 0.00
- Peeling Close To The Orientability Threshold: Spatial Coupling In Hashing-based Data Structures (2020) · 3.58
- Pqtable: Non-exhaustive Fast Search For Product-quantized Codes Using Hash Tables (2017) · 7.16
- Subsets And Supermajorities: Optimal Hashing-based Set Similarity Search (2019) · 5.84
- A Lower Bound Of Hash Codes' Performance (2022) · 1.56
- A Resource-frugal Probabilistic Dictionary And Applications In Bioinformatics (2017) · 9.41