Reinforcement Learning In Presence Of Discrete Markovian Context Evolution
2022 · Hang Ren, Aivar Sootla, Taher Jafferjee, et al.
Abstract
We consider a context-dependent Reinforcement Learning (RL) setting characterized by: a) an unknown finite number of contexts that are not directly observable; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian context evolution. We argue that this challenging case is often met in applications, and we tackle it using a Bayesian approach and variational inference. We adapt a sticky Hierarchical Dirichlet Process (HDP) prior for model learning, which is arguably best suited for Markov process modeling. We then derive a context distillation procedure, which identifies and removes spurious contexts in an unsupervised fashion. We argue that the combination of these two components makes it possible to infer the number of contexts from data, thus dealing with the context cardinality assumption. We then find the representation of the optimal policy enabling efficient policy learning using off-the-shelf RL algorithms. Finally, we demonstrate empirically (using gym envir
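To make the setting in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It draws a context transition matrix from a truncated (weak-limit) sticky HDP prior, simulates Markovian context switches, and then drops rarely visited contexts. All function names, the truncation level `K`, the hyperparameters `gamma`, `alpha`, `kappa`, and the stationary-probability threshold are assumptions chosen for illustration. In particular, the thresholding step is only a simple stand-in for the idea of removing spurious contexts and may differ from the paper's distillation procedure.

```python
import numpy as np


def sample_sticky_transitions(K, gamma, alpha, kappa, rng):
    """Draw a K x K transition matrix from a weak-limit sticky HDP prior."""
    # Global context weights shared by all rows (finite approximation of the top-level DP).
    beta = rng.dirichlet(np.full(K, gamma / K))
    # Row j is centred on beta, with extra mass kappa on the self-transition j -> j.
    # The tiny constant keeps every concentration strictly positive.
    conc = alpha * beta[None, :] + kappa * np.eye(K) + 1e-8
    return np.vstack([rng.dirichlet(conc[j]) for j in range(K)])


def simulate_contexts(P, T, rng, z0=0):
    """Sample a length-T context sequence from the Markov chain defined by P."""
    z = np.empty(T, dtype=int)
    z[0] = z0
    for t in range(1, T):
        z[t] = rng.choice(P.shape[0], p=P[z[t - 1]])
    return z


def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 in the least-squares sense."""
    K = P.shape[0]
    A = np.vstack([P.T - np.eye(K), np.ones((1, K))])
    b = np.zeros(K + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    pi = np.clip(pi, 0.0, None)  # remove tiny negative round-off
    return pi / pi.sum()


def drop_rare_contexts(P, threshold):
    """Illustrative heuristic: keep contexts whose stationary mass is at least threshold."""
    keep = stationary_distribution(P) >= threshold
    P_kept = P[np.ix_(keep, keep)]
    # Renormalise rows so the reduced matrix is again a valid transition matrix.
    return P_kept / P_kept.sum(axis=1, keepdims=True), keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = sample_sticky_transitions(K=6, gamma=8.0, alpha=5.0, kappa=30.0, rng=rng)
    z = simulate_contexts(P, T=200, rng=rng)
    P_kept, keep = drop_rare_contexts(P, threshold=0.05)
    print("context switches in 200 steps:", int(np.sum(z[1:] != z[:-1])))
    print("contexts kept:", int(keep.sum()), "of", P.shape[0])
```

A larger `kappa` puts more prior mass on self-transitions, so contexts persist for longer stretches and switches stay abrupt but infrequent. In this sketch the truncation level `K` only needs to be an upper bound on the number of contexts, since unused contexts can be removed afterwards.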
Related papers
- Online Reinforcement Learning In Non-stationary Context-driven Environments (2023)
- No-regret Exploration In Contextual Reinforcement Learning (2019)
- Inverse Reinforcement Learning In Contextual MDPs (2019)
- Dynamics-adaptive Continual Reinforcement Learning Via Progressive Contextualization (2022)
- Contextual Intelligence The Next Leap For Reinforcement Learning (2026)
- Learning Without Knowing: Unobserved Context In Continuous Transfer Reinforcement Learning (2021)
- Demystifying Reinforcement Learning In Time-varying Systems (2022)
- Efficient Off-policy Meta-reinforcement Learning Via Probabilistic Context Variables (2019)