Safe Continual Reinforcement Learning Methods For Nonstationary Environments. Towards A Survey Of The State Of The Art
2026 · Timofey Tomashevskiy
Abstract
This work surveys the state of the art in continual online safe reinforcement learning (COSRL). We discuss theoretical aspects, challenges, and open questions in building COSRL algorithms. We provide a taxonomy of COSRL methods, with details on each, organized by the type of safe learning mechanism that accounts for adaptation to nonstationarity. We categorize formulations of safety constraints for online reinforcement learning algorithms and, finally, discuss prospects for creating reliable, safe online learning algorithms.
Keywords: safe RL in nonstationary environments, safe continual reinforcement learning under nonstationarity, HM-MDP, NSMDP, POMDP, safe POMDP, constraints for continual learning, safe continual reinforcement learning review, safe continual reinforcement learning survey, safe continual reinforcement learning, safe online learning under distribution shift, safe continual online
Related papers
- Safe Continual Reinforcement Learning In Non-stationary Environments (2026)
- Context-aware Safe Reinforcement Learning For Non-stationary Environments (2021)
- A Survey Of Continual Reinforcement Learning (2025)
- Online Reinforcement Learning In Non-stationary Context-driven Environments (2023)
- DOPE: Doubly Optimistic And Pessimistic Exploration For Safe Reinforcement Learning (2021)
- Constraints Penalized Q-learning For Safe Offline Reinforcement Learning (2021)
- Omnisafe: An Infrastructure For Accelerating Safe Reinforcement Learning Research (2023)
- Safe Reinforcement Learning For Constrained Markov Decision Processes With Stochastic Stopping Time (2024)