Exploratory Not Explanatory: Counterfactual Analysis Of Saliency Maps For Deep Reinforcement Learning
2019 · Akanksha Atrey, Kaleigh Clary, David Jensen
Abstract
Saliency maps are frequently used to support explanations of the behavior of deep reinforcement learning (RL) agents. However, a review of how saliency maps are used in practice indicates that the derived explanations are often unfalsifiable and can be highly subjective. We introduce an empirical approach grounded in counterfactual reasoning to test the hypotheses generated from saliency maps and assess the degree to which they correspond to the semantics of RL environments. We use Atari games, a common benchmark for deep RL, to evaluate three types of saliency maps. Our results show the extent to which existing claims about Atari games can be evaluated and suggest that saliency maps are best viewed as an exploratory tool rather than an explanatory tool.
Related papers
- Benchmarking Perturbation-based Saliency Maps For Explaining Atari Agents (2021)
- Visualizing And Understanding Atari Agents (2017)
- Explain Your Move: Understanding Agent Actions Using Specific And Relevant Feature Attribution (2019)
- Explaining Deep Reinforcement Learning Agents In The Atari Domain Through A Surrogate Model (2021)
- Local And Global Explanations Of Agent Behavior: Integrating Strategy Summaries With Saliency Maps (2020)
- Are Gradient-based Saliency Maps Useful In Deep Reinforcement Learning? (2020)
- Counterfactual State Explanations For Reinforcement Learning Agents Via Generative Deep Learning (2021)
- Machine Versus Human Attention In Deep Reinforcement Learning Tasks (2020)