Learning Without Knowing: Unobserved Context In Continuous Transfer Reinforcement Learning
2021 · Chenyu Liu, Yan Zhang, Yi Shen, et al.
Abstract
In this paper, we consider a transfer Reinforcement Learning (RL) problem in continuous state and action spaces, under unobserved contextual information. For example, the context can represent the mental view of the world that an expert agent has formed through past interactions with this world. We assume that this context is not accessible to a learner agent who can only observe the expert data. Then, our goal is to use the context-aware expert data to learn an optimal context-unaware policy for the learner using only a few new data samples. Such problems are typically solved using imitation learning that assumes that both the expert and learner agents have access to the same information. However, if the learner does not know the expert context, using the expert data alone will result in a biased learner policy and will require many new data samples to improve. To address this challenge, in this paper, we formulate the learning problem as a causal bound-constrained Multi-Armed-Bandit
Related papers
- Contextual Intelligence The Next Leap For Reinforcement Learning (2026)
- Reinforcement Learning In Presence Of Discrete Markovian Context Evolution (2022)
- Online Reinforcement Learning In Non-stationary Context-driven Environments (2023)
- Contextual Policy Transfer In Reinforcement Learning Domains Via Deep Mixtures-of-experts (2020)
- Contextual Bandits And Optimistically Universal Learning (2022)
- Contextualize Me -- The Case For Context In Reinforcement Learning (2022)
- Statistical Context Detection For Deep Lifelong Reinforcement Learning (2024)
- Reinforcement Learning With Continuous Actions Under Unmeasured Confounding (2025)