TokenRec: Learning To Tokenize ID For LLM-based Generative Recommendation
2024 · Haohao Qu, Wenqi Fan, Zihuai Zhao, et al.
Abstract
There is a growing interest in utilizing large-scale language models (LLMs) to advance next-generation Recommender Systems (RecSys), driven by their outstanding language understanding and in-context learning capabilities. In this scenario, tokenizing (i.e., indexing) users and items becomes essential for ensuring a seamless alignment of LLMs with recommendations. While several studies have made progress in representing users and items through textual contents or latent representations, challenges remain in efficiently capturing high-order collaborative knowledge into discrete tokens that are compatible with LLMs. Additionally, the majority of existing tokenization approaches often face difficulties in generalizing effectively to new/unseen users or items that were not in the training corpus. To address these challenges, we propose a novel framework called TokenRec, which introduces not only an effective ID tokenization strategy but also an efficient retrieval paradigm for LLM-based rec
Related papers
- Unified Semantic And ID Representation Learning For Deep Recommenders (2025) 0.00
- Learning To Tokenize For Generative Retrieval (2023) 4.52
- NoteLLM: A Retrievable Large Language Model For Note Recommendation (2024) 9.41
- ItemRAG: Item-based Retrieval-augmented Generation For LLM-based Recommendation (2025) 1.20
- PLUM: Adapting Pre-trained Language Models For Industrial-scale Generative Recommendations (2025) 2.26
- CoST: Contrastive Quantization Based Semantic Tokenization For Generative Recommendation (2024) 7.81
- LMAR: Language Model Augmented Retriever For Domain-specific Knowledge Indexing (2025) 1.57
- Bottleneck Tokens For Unified Multimodal Retrieval (2026) 0.00