Co-activation Graph Analysis Of Safety-verified And Explainable Deep Reinforcement Learning Policies
2025 · Dennis Gross, Helge Spieker
Abstract
Deep reinforcement learning (RL) policies can exhibit unsafe behaviors and are challenging to interpret. To address these challenges, we combine RL policy model checking (a technique for determining whether RL policies exhibit unsafe behaviors) with co-activation graph analysis (a method that maps a neural network's inner workings by analyzing neuron activation patterns). This combination lets us interpret the inner workings of a safety-verified RL policy during sequential decision-making. We demonstrate its applicability in various experiments.
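As a rough illustration of the co-activation idea mentioned in the abstract, the sketch below builds a graph whose nodes are neurons and whose edges connect neurons with strongly correlated activations. It is not the authors' implementation: the function name `coactivation_graph`, the use of Pearson correlation as the edge weight, the `threshold` parameter, and the synthetic activations are all assumptions made for illustration. In practice the activation matrix would be recorded from the policy network on the states of interest, for example states covered by a verified safety property.

```python
import numpy as np


def coactivation_graph(activations, threshold=0.5):
    """Return a weighted adjacency matrix over neurons.

    activations: array of shape (n_states, n_neurons), one row per state.
    threshold:   hypothetical cutoff; edges with |correlation| below it are dropped.
    """
    # Pearson correlation between neuron columns (assumed edge weight).
    corr = np.corrcoef(activations, rowvar=False)
    # Neurons with constant activation yield NaN; treat them as unconnected.
    corr = np.nan_to_num(corr)
    # No self-loops.
    np.fill_diagonal(corr, 0.0)
    # Keep only strong co-activations.
    return np.where(np.abs(corr) >= threshold, corr, 0.0)


# Synthetic activations (illustrative only): neurons 0 and 1 fire together,
# neuron 2 is independent of both.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
acts = np.hstack([
    base,
    2.0 * base + 0.01 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

adj = coactivation_graph(acts)
# Weighted degree: a simple per-neuron importance score on the graph.
degree = np.abs(adj).sum(axis=0)
```

Graph measures such as weighted degree or community structure can then be compared across neurons to see which ones co-activate on the states being analyzed.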
Related papers
- Concurrent Learning Of Policy And Unknown Safety Constraints In Reinforcement Learning (2024) 0.00
- ActSafe: Active Exploration With Safety Constraints For Reinforcement Learning (2024) 0.00
- An Abstraction-based Method To Check Multi-agent Deep Reinforcement-learning Behaviors (2021) 2.26
- Generation Of Policy-level Explanations For Reinforcement Learning (2019) 11.39
- From Explainability To Interpretability: Interpretable Policies In Reinforcement Learning Via Model Explanation (2025) 0.00
- How Do You Act? An Empirical Study To Understand Behavior Of Deep Reinforcement Learning Agents (2020) 0.00
- Hierarchical Framework For Interpretable And Probabilistic Model-based Safe Reinforcement Learning (2023) 0.00
- Reinforcement Learning With Adaptive Regularization For Safe Control Of Critical Systems (2024) 0.00