Model-enhanced Contrastive Reinforcement Learning For Sequential Recommendation
2023 · Chengpeng Li, Zhengyi Yang, Jizhi Zhang, et al.
Abstract
Reinforcement learning (RL) has been widely applied in recommendation systems due to its potential in optimizing the long-term engagement of users. From the perspective of RL, recommendation can be formulated as a Markov decision process (MDP), where the recommendation system (agent) interacts with users (environment) and acquires feedback (reward signals). However, online interaction is impractical because of concerns about user experience and implementation complexity, so RL recommenders can only be trained on offline datasets containing limited reward signals and state transitions. As a result, the sparsity of reward signals and state transitions is severe, yet it has long been overlooked by existing RL recommenders. Worse still, RL methods learn by trial and error, but negative feedback cannot be obtained in implicit-feedback recommendation tasks, which aggravates the overestimation problem of offline RL recommenders. To address these challenges,
Related papers
- Contrastive UCB: Provably Efficient Contrastive Self-supervised Learning In Online Reinforcement Learning (2022)
- Environment Reconstruction With Hidden Confounders For Reinforcement Learning Based Recommendation (2019)
- Accelerating Offline Reinforcement Learning Application In Real-time Bidding And Recommendation: Potential Use Of Simulation (2021)
- Edge-compatible Reinforcement Learning For Recommendations (2021)
- Bridging The Gap Between Offline And Online Reinforcement Learning Evaluation Methodologies (2022)
- MOReL: Model-based Offline Reinforcement Learning (2020)
- Overcoming Model Bias For Robust Offline Deep Reinforcement Learning (2020)
- Contrastive Diffuser: Planning Towards High Return States Via Contrastive Learning (2024)