CoST: Contrastive Quantization Based Semantic Tokenization For Generative Recommendation
2024 · Jieming Zhu, Mengqun Jin, Qijiong Liu, et al.
Abstract
Embedding-based retrieval serves as a dominant approach to candidate item matching for industrial recommender systems. With the success of generative AI, generative retrieval has recently emerged as a new retrieval paradigm for recommendation, which casts item retrieval as a generation problem. Its model consists of two stages: semantic tokenization and autoregressive generation. The first stage involves item tokenization that constructs discrete semantic tokens to index items, while the second stage autoregressively generates semantic tokens of candidate items. Therefore, semantic tokenization serves as a crucial preliminary step for training generative recommendation models. Existing research usually employs a vector quantizer with reconstruction loss (e.g., RQ-VAE) to obtain semantic tokens of items, but this method fails to capture the essential neighborhood relationships that are vital for effective item modeling in recommender systems. In this paper, we propose a contrastive qua
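As a rough illustration of the tokenization stage the abstract describes, the sketch below assigns an item embedding a tuple of discrete tokens by residual quantization, the lookup step that RQ-VAE-style tokenizers rely on. The codebooks, the embedding, and the function names `nearest_code` and `residual_quantize` are hypothetical toy values chosen for this example; the sketch does not train anything and does not include the paper's contrastive objective.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def residual_quantize(embedding: Vector,
                      codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize level by level: pick the nearest code, subtract it, repeat."""
    residual = list(embedding)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        tokens.append(idx)
        # The next level only sees what this level failed to explain.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


# Toy, hand-written codebooks (assumption: a real tokenizer learns these).
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],   # coarse level
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],  # fine level
]

item_embedding = [1.2, 1.0]  # hypothetical item embedding
semantic_tokens = residual_quantize(item_embedding, CODEBOOKS)
print(semantic_tokens)  # [1, 1]
```

In the second stage, a sequence model would be trained to generate such token tuples autoregressively; that part is outside the scope of this sketch.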
Related papers
- Generative Retrieval With Semantic Tree-structured Item Identifiers Via Contrastive Learning (2023)
- Breaking The Hourglass Phenomenon Of Residual Quantization: Enhancing The Upper Bound Of Generative Retrieval (2024)
- OnePiece: The Great Route To Generative Recommendation -- A Case Study From Tencent Algorithm Competition (2025)
- Better Generalization With Semantic IDs: A Case Study In Ranking For Recommendations (2023)
- Unified Semantic And ID Representation Learning For Deep Recommenders (2025)
- Real-time Indexing For Large-scale Recommendation By Streaming Vector Quantization Retriever (2025)
- GRank: Towards Target-aware And Streamlined Industrial Retrieval With A Generate-rank Framework (2025)
- Domain-adaptive And Scalable Dense Retrieval For Content-based Recommendation (2026)