Environment Reconstruction With Hidden Confounders For Reinforcement Learning Based Recommendation
2019 · Wenjie Shang, Yang Yu, Qingyang Li, et al.
Abstract
Reinforcement learning aims to find the best policy model for decision making, and has been shown to be powerful for sequential recommendation. Training a policy by reinforcement learning, however, requires an environment to interact with. In many real-world applications, training the policy in the real environment can incur an unbearable cost because of the exploration involved. Reconstructing the environment from past data is thus an appealing way to release the power of reinforcement learning in these applications. Reconstructing the environment essentially means extracting the causal effect model from the data. However, real-world applications are often too complex to offer fully observable environment information, so there may well be unobserved confounding variables behind the data. Such hidden confounders can obstruct an effective reconstruction of the environment. In this paper, by treating the hidden confounder as a hidden policy, we propose