Criticality And Safety Margins For Reinforcement Learning
2024 · Alexander Grushin, Walt Woods, Alvaro Velasquez, et al.
Abstract
State-of-the-art reinforcement learning methods sometimes encounter unsafe situations. Identifying when these situations occur is of interest both for post-hoc analysis and during deployment, where it might be advantageous to call out to a human overseer for help. Methods for gauging the criticality of different points in time have been developed, but their accuracy is not well established due to a lack of ground truth, and they are not designed to be easily interpretable by end users. Therefore, we seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality. Safety margins make these interpretable, when defined as the number of random actions for which performance l
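The definition of true criticality given in the abstract (expected drop in reward after n consecutive random actions) can be illustrated with a small Monte Carlo sketch. Everything below is an assumption made for illustration and is not taken from the paper: the corridor environment, the always-move-right policy, the reward values, and the function names `step`, `policy`, `rollout`, and `true_criticality` are all hypothetical.

```python
import random

# Hypothetical toy corridor (not from the paper): cliff at 0, goal at 6.
GOAL, CLIFF = 6, 0
ACTIONS = (-1, +1)


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    nxt = state + action
    if nxt <= CLIFF:
        return CLIFF, -10.0, True   # falling off the cliff ends the episode
    if nxt >= GOAL:
        return GOAL, 10.0, True     # reaching the goal ends the episode
    return nxt, 0.0, False


def policy(state):
    """Assumed agent policy: always move toward the goal."""
    return +1


def rollout(state, n_random, rng, horizon=50):
    """Take n_random uniformly random actions, then follow the policy."""
    total = 0.0
    for t in range(horizon):
        action = rng.choice(ACTIONS) if t < n_random else policy(state)
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return total


def true_criticality(state, n, samples=2000, seed=0):
    """Estimate the expected drop in return caused by n random actions."""
    rng = random.Random(seed)
    on_policy = sum(rollout(state, 0, rng) for _ in range(samples)) / samples
    perturbed = sum(rollout(state, n, rng) for _ in range(samples)) / samples
    return on_policy - perturbed


if __name__ == "__main__":
    for s in (1, 4):
        print(f"state={s}, n=1, estimated criticality={true_criticality(s, 1):.2f}")
```

In this sketch, a state next to the cliff shows a large estimated drop for a single random action, while a state far from the cliff shows none, because one wrong step there is fully recoverable by the policy.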
Related papers
- Provably Optimal Reinforcement Learning Under Safety Filtering (2025)
- On Assessing The Safety Of Reinforcement Learning Algorithms Using Formal Methods (2021)
- Towards Safe Reinforcement Learning Via Constraining Conditional Value-at-risk (2022)
- Efficient Policy Evaluation With Safety Constraint For Reinforcement Learning (2024)
- Context-aware Safe Reinforcement Learning For Non-stationary Environments (2021)
- Implicit Safe Set Algorithm For Provably Safe Reinforcement Learning (2024)
- Extreme Risk Mitigation In Reinforcement Learning Using Extreme Value Theory (2023)
- Criticality-based Varying Step-number Algorithm For Reinforcement Learning (2022)