Safe Reinforcement Learning In Black-box Environments Via Adaptive Shielding
2024 · Daniel Bethell, Simos Gerasimou, Radu Calinescu, et al.
Abstract
Enabling reinforcement learning (RL) agents to explore safely during training is a critical challenge for their deployment in many real-world scenarios. When prior knowledge of the domain or task is unavailable, training RL agents in unknown, black-box environments presents an even greater safety risk. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, and uses this knowledge to protect the RL agent from executing actions that are likely to yield hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques shows that ADVICE significantly reduces safety violations during training (by approximately 50%), while achieving an outcome reward competitive with other techniques.
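To make the idea of post-shielding concrete, the following is a minimal sketch of a generic shield that checks an action after the agent proposes it and overrides it when it is judged unsafe. It is not the ADVICE implementation: the abstract says ADVICE learns safe and unsafe features with a contrastive autoencoder, whereas here the safety model is a hand-written stand-in. The names `shield_action`, `risk_score`, `toy_risk`, and the `threshold` parameter are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def shield_action(
    state: Sequence[float],
    proposed_action: int,
    candidate_actions: Sequence[int],
    risk_score: Callable[[Sequence[float], int], float],
    threshold: float = 0.5,
) -> int:
    """Keep the proposed action if it looks safe, else pick the least risky one.

    `risk_score` is a hypothetical safety model mapping a state-action pair
    to an estimated risk in [0, 1]; in a learned shield it would be trained
    from observed safe and unsafe outcomes.
    """
    if risk_score(state, proposed_action) <= threshold:
        return proposed_action
    # Override: fall back to the candidate with the lowest estimated risk.
    return min(candidate_actions, key=lambda a: risk_score(state, a))


# Toy stand-in for a learned safety model (assumption): a 1-D track where
# any move that lands on position 5 or beyond is unsafe.
def toy_risk(state: Sequence[float], action: int) -> float:
    position = state[0]
    next_position = position + (1 if action == 1 else -1)
    return 1.0 if next_position >= 5 else 0.0


# Far from the hazard the proposal passes through unchanged.
kept = shield_action([1.0], 1, [0, 1], toy_risk)
# Next to the hazard the shield replaces "move right" with "move left".
overridden = shield_action([4.0], 1, [0, 1], toy_risk)
```

In this sketch the shield sits between the policy and the environment, so it needs no access to the environment's dynamics, only the safety model's estimate. How that estimate is learned and how the threshold adapts during training are the parts specific to ADVICE and are not reproduced here.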
Related papers
- ActSafe: Active Exploration With Safety Constraints For Reinforcement Learning (2024) 0.00
- Reinforcement Learning By Guided Safe Exploration (2023) 5.24
- Safe Reinforcement Learning With Dual Robustness (2023) 8.60
- Probabilistic Counterexample Guidance For Safer Reinforcement Learning (Extended Version) (2023) 0.00
- On The Robustness Of Safe Reinforcement Learning Under Observational Perturbations (2022) 0.00
- On Assessing The Safety Of Reinforcement Learning Algorithms Using Formal Methods (2021) 0.00
- Reinforcement Learning With Adaptive Regularization For Safe Control Of Critical Systems (2024) 0.00
- Safe-support Q-learning: Learning Without Unsafe Exploration (2026) 0.00