Reinforcement Learning in Non-Markovian Environments
2022 · Siddharth Chandak, Pratik Shah, Vivek S Borkar, et al.
Abstract
Motivated by the novel paradigm developed by Van Roy and coauthors for reinforcement learning in arbitrary non-Markovian environments, we propose a related formulation and explicitly pin down the error caused by non-Markovianity of observations when the Q-learning algorithm is applied on this formulation. Based on this observation, we propose that the criterion for agent design should be to seek good approximations for certain conditional laws. Inspired by classical stochastic control, we show that our problem reduces to that of recursive computation of approximate sufficient statistics. This leads to an autoencoder-based scheme for agent design which is then numerically tested on partially observed reinforcement learning environments.
Related papers
- Partially Observable Mean Field Reinforcement Learning (2020) · 0.00
- Learning To Steer Markovian Agents Under Model Uncertainty (2024) · 0.00
- Reinforcement Learning Under Partial Observability Guided By Learned Environment Models (2022) · 6.34
- Model-agnostic Solutions For Deep Reinforcement Learning In Non-ergodic Contexts (2026) · 0.00
- Simple Agent, Complex Environment: Efficient Reinforcement Learning With Agent States (2021) · 0.00
- Multi-agent Off-policy Actor-critic Reinforcement Learning For Partially Observable Environments (2024) · 2.26
- Optimal Decision-making In Mixed-agent Partially Observable Stochastic Environments Via Reinforcement Learning (2019) · 0.00
- An Information-theoretic Optimality Principle For Deep Reinforcement Learning (2017) · 0.00