Model-based Reinforcement Learning Under Random Observation Delays
2025 · Armin Karamzade, Kyungmin Kim, Jb Lanier, et al.
Abstract
Delays frequently occur in real-world environments, yet standard reinforcement learning (RL) algorithms often assume instantaneous perception of the environment. We study random sensor delays in POMDPs, where observations may arrive out-of-sequence, a setting that has not been previously addressed in RL. We analyze the structure of such delays and demonstrate that naive approaches, such as stacking past observations, are insufficient for reliable performance. To address this, we propose a model-based filtering process that sequentially updates the belief state based on an incoming stream of observations. We then introduce a simple delay-aware framework that incorporates this idea into model-based RL, enabling agents to effectively handle random delays. Applying this framework to the Dreamer world-modeling scheme, our method consistently outperforms delay-aware baselines developed for MDPs and demonstrates robustness to delay distribution shifts during deployment. Additionally, we prese
Related papers
- Revisiting State Augmentation Methods For Reinforcement Learning With Stochastic Delays (2021)
- Reinforcement Learning With Random Delays (2020)
- Reinforcement Learning For Control Systems With Time Delays: A Comprehensive Survey (2026)
- Reinforcement Learning Via Conservative Agent For Environments With Random Delays (2025)
- Blind Decision Making: Reinforcement Learning With Delayed Observations (2020)
- Delay-aware Multi-agent Reinforcement Learning For Cooperative And Competitive Environments (2020)
- Boosting Reinforcement Learning With Strongly Delayed Feedback Through Auxiliary Short Delays (2024)
- Effective Multi-user Delay-constrained Scheduling With Deep Recurrent Reinforcement Learning (2022)