Counterfactual State Explanations For Reinforcement Learning Agents Via Generative Deep Learning
2021 Β· Matthew L. Olson, Roli Khanna, Lawrence Neal, et al.
Abstract
Counterfactual explanations, which deal with "why not?" scenarios, can provide insightful explanations to an AI agent's behavior. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents which operate in visual input environments like Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanations based on generative deep learning. Specifically, a counterfactual state illustrates what minimal change is needed to an Atari game image such that the agent chooses a different action. We also evaluate the effectiveness of counterfactual states on human participants who are not machine learning experts. Our first user study investigates if humans can discern if the counterfactual state explanations are produced by the actual game or produced by a generative deep learning approach. Our second user study investigates if counterfactual state explanations can help non-expert participants ident
Authors
(none)
Tags
Stats
Related papers
- Counterfactual States For Atari Agents Via Generative Deep Learning (2019)0.00
- Ganterfactual-rl: Understanding Reinforcement Learning Agents' Strategies Through Visual Counterfactual Explanations (2023)2.26
- Redefining Counterfactual Explanations For Reinforcement Learning: Overview, Challenges And Opportunities (2022)0.00
- Experiential Explanations For Reinforcement Learning (2022)2.26
- RACCER: Towards Reachable And Certain Counterfactual Explanations For Reinforcement Learning (2023)0.00
- Explaining Deep Reinforcement Learning Agents In The Atari Domain Through A Surrogate Model (2021)0.00
- Why The Agent Made That Decision: Contrastive Explanation Learning For Reinforcement Learning (2024)0.00
- Learning Impartial Policies For Sequential Counterfactual Explanations Using Deep Reinforcement Learning (2023)0.00