Explaining Reinforcement Learning With Shapley Values
2023 · Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek
Abstract
For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
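For readers unfamiliar with the game-theoretic notion the abstract relies on, the following minimal sketch computes exact Shapley values for a small cooperative game by enumerating coalitions. It illustrates the standard Shapley value definition only and is not the SVERL method itself; the function `shapley_values`, the toy payoff `toy_value`, and the player labels are hypothetical names chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player labels.
    value:   function mapping a frozenset of players to a real payoff.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for a coalition of size k that excludes p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical toy game (an assumption, not taken from the paper):
# player "a" adds 1 on its own; "b" and "c" add 2 only when both are present.
def toy_value(coalition):
    payoff = 0.0
    if "a" in coalition:
        payoff += 1.0
    if "b" in coalition and "c" in coalition:
        payoff += 2.0
    return payoff


phi = shapley_values(["a", "b", "c"], toy_value)
```

In this toy game the contributions sum to the payoff of the full coalition, and the two players that are only useful together receive equal shares. Exhaustive enumeration scales exponentially in the number of players, so it is only practical for small games.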
Related papers
- A Theoretical Framework For Explaining Reinforcement Learning With Shapley Values (2025)
- Collective Explainable AI: Explaining Cooperative Strategies And Agent Contribution In Multiagent Reinforcement Learning With Shapley Values (2021)
- Explaining Reinforcement Learning: A Counterfactual Shapley Values Approach (2024)
- The Shapley Value In Machine Learning (2022)
- SHAQ: Incorporating Shapley Value Theory Into Multi-agent Q-learning (2021)
- From Explainability To Interpretability: Interpretable Policies In Reinforcement Learning Via Model Explanation (2025)
- Explainability In Deep Reinforcement Learning, A Review Into Current Methods And Applications (2022)
- Efficiently Quantifying Individual Agent Importance In Cooperative MARL (2023)