Reconstructing Actions To Explain Deep Reinforcement Learning
2020 · Xuan Chen, Zifan Wang, Yucai Fan, et al.
Abstract
Feature attribution has been a foundational building block for explaining input feature importance in supervised learning with Deep Neural Networks (DNNs), but it faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of *action reconstruction* functions that mimic the behavior of a network in deep RL. This approach allows us to answer more complex explainability questions than direct application of DNN attribution methods, which we adapt to *behavior-level attributions* in building our action reconstructions. It also allows us to define *agreement*, a metric for quantitatively evaluating the explainability of our methods. Our experiments on a variety of Atari games suggest that perturbation-based attribution methods are significantly more suitable for reconstructing actions to explain the deep RL agent than alternative attribution methods, and show greater *agreement* than existing explainability
Related papers
- Explain Your Move: Understanding Agent Actions Using Specific And Relevant Feature Attribution (2019)
- Explaining Deep Reinforcement Learning Agents In The Atari Domain Through A Surrogate Model (2021)
- Experiential Explanations For Reinforcement Learning (2022)
- Explainability In Deep Reinforcement Learning (2020)
- Why The Agent Made That Decision: Contrastive Explanation Learning For Reinforcement Learning (2024)
- Generating Explanations From Deep Reinforcement Learning Using Episodic Memory (2022)
- How Do You Act? An Empirical Study To Understand Behavior Of Deep Reinforcement Learning Agents (2020)
- A Snapshot Of Influence: A Local Data Attribution Framework For Online Reinforcement Learning (2025)