Mixed-precision Embeddings For Large-scale Recommendation Models
2024 · Shiwei Li, Zhuoqi Hu, Xing Tang, et al.
Abstract
Embedding techniques have become essential components of large databases in the deep learning era. By encoding discrete entities, such as words, items, or graph nodes, into continuous vector spaces, embeddings facilitate more efficient storage, retrieval, and processing in large databases. Especially in the domain of recommender systems, millions of categorical features are encoded as unique embedding vectors, which facilitates the modeling of similarities and interactions among features. However, numerous embedding vectors can result in significant storage overhead. In this paper, we aim to compress the embedding table through quantization techniques. Given that features vary in importance levels, we seek to identify an appropriate precision for each feature to balance model accuracy and memory usage. To this end, we propose a novel embedding compression method, termed Mixed-Precision Embeddings (MPE). Specifically, to reduce the size of the search space, we first group features by fr
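The idea of giving each feature group its own precision can be illustrated with a minimal sketch. It is not the paper's method: the abstract is cut off before it states the grouping criterion and says nothing here about how precisions are searched, so the sketch takes both the group assignment and the per-group bit widths as given inputs. The quantizer is a generic uniform symmetric scheme with one scale per row, chosen for simplicity, and every name (`quantize_row`, `mixed_precision_quantize`, `compression_ratio`, `group_of_row`, `bits_of_group`) is hypothetical.

```python
from typing import Dict, List, Sequence


def quantize_row(row: Sequence[float], bits: int) -> List[float]:
    """Quantize one embedding row to `bits` bits and map it back to floats."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits (one is the sign)")
    qmax = 2 ** (bits - 1) - 1                  # largest integer level, e.g. 127 for 8 bits
    max_abs = max((abs(x) for x in row), default=0.0)
    if max_abs == 0.0:
        return [0.0] * len(row)                 # an all-zero (or empty) row is unchanged
    scale = max_abs / qmax                      # per-row step size (assumption of this sketch)
    out = []
    for x in row:
        q = max(-qmax, min(qmax, round(x / scale)))  # nearest integer level, clamped
        out.append(q * scale)                   # dequantized value
    return out


def mixed_precision_quantize(
    table: Sequence[Sequence[float]],
    group_of_row: Sequence[int],
    bits_of_group: Dict[int, int],
) -> List[List[float]]:
    """Quantize each row with the bit width assigned to its group."""
    return [
        quantize_row(row, bits_of_group[group_of_row[i]])
        for i, row in enumerate(table)
    ]


def compression_ratio(
    group_of_row: Sequence[int],
    bits_of_group: Dict[int, int],
    full_bits: int = 32,
) -> float:
    """Ratio of full-precision bits to mixed-precision bits per element.

    Counts only the embedding values; the per-row scale is ignored.
    """
    used = sum(bits_of_group[g] for g in group_of_row)
    return full_bits * len(group_of_row) / used


# Toy usage: row 0 belongs to an 8-bit group, row 1 to a 2-bit group.
table = [[0.5, -1.0, 0.25], [0.1, 0.2, -0.3]]
group_of_row = [0, 1]
bits_of_group = {0: 8, 1: 2}
approx = mixed_precision_quantize(table, group_of_row, bits_of_group)
ratio = compression_ratio(group_of_row, bits_of_group)
```

In this toy setup the 8-bit row is reproduced to within half a quantization step, while the 2-bit row collapses onto three levels, which is the accuracy-versus-memory trade-off the abstract refers to.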
Related papers
- Mem-rec: Memory Efficient Recommendation System Using Alternative Representation (2023) · 0.00
- Multi-probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions And Enhancing Model Freshness In Large-scale Recommenders (2026) · 0.00
- Embedding In Recommender Systems: A Survey (2023) · 0.00
- CAFE: Towards Compact, Adaptive, And Fast Embedding For Large-scale Recommendation Models (2023) · 8.09
- Semantically Constrained Memory Allocation (SCMA) For Embedding In Efficient Recommendation Systems (2021) · 0.00
- Fine-grained Embedding Dimension Optimization During Training For Recommender Systems (2024) · 0.00
- Learning Compact Compositional Embeddings Via Regularized Pruning For Recommendation (2023) · 8.36
- Optimization Of Embeddings Storage For RAG Systems Using Quantization And Dimensionality Reduction Techniques (2025) · 0.00