Contrastive Explanations For Reinforcement Learning In Terms Of Expected Consequences
2018 · Jasper van der Waa, Jurriaan van Diggelen, Karel van den Bosch, et al.
Abstract
Machine learning models are becoming increasingly proficient at complex tasks. However, even for experts in the field, it can be difficult to understand what a model has learned. This hampers trust and acceptance, and it obstructs the possibility of correcting the model. There is therefore a need for transparency in machine learning models. The development of transparent classification models has received much attention, but there have been few developments toward transparent Reinforcement Learning (RL) models. In this study we propose a method that enables an RL agent to explain its behavior in terms of the expected consequences of state transitions and outcomes. First, we define a translation of states and actions to a description that is easier for human users to understand. Second, we develop a procedure that enables the agent to obtain the consequences of a single action, as well as of its entire policy. The method calculates contrasts between the consequences of a policy derived from a user
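The idea outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the corridor environment, the outcome labels, and the names `step`, `describe_outcome`, `expected_consequences`, `contrast`, `learned_policy`, and `foil_policy` are all hypothetical, and a deterministic transition model is assumed for simplicity. The sketch translates raw states into readable outcome labels, rolls two policies forward, and reports only the outcomes on which they differ.

```python
from collections import Counter

# Hypothetical 1-D corridor: states 0..5, state 3 is a trap, state 5 is the goal.
N_STATES = 6

def step(state, action):
    """Deterministic toy transition model (an assumption of this sketch)."""
    move = {"left": -1, "right": 1, "stay": 0}[action]
    return min(max(state + move, 0), N_STATES - 1)

def describe_outcome(state):
    """Translate a raw state into a human-readable outcome label."""
    if state == 3:
        return "enters trap"
    if state == 5:
        return "reaches goal"
    return "moves safely"

def expected_consequences(policy, start, horizon):
    """Roll the policy forward and count the described outcomes it visits."""
    outcomes = Counter()
    state = start
    for _ in range(horizon):
        state = step(state, policy(state))
        outcomes[describe_outcome(state)] += 1
    return outcomes

def contrast(fact, foil):
    """Keep only the outcome labels whose counts differ: (fact count, foil count)."""
    labels = set(fact) | set(foil)
    return {label: (fact[label], foil[label])
            for label in sorted(labels) if fact[label] != foil[label]}

# Hypothetical learned policy: always move right.
def learned_policy(state):
    return "right"

# Hypothetical user-derived alternative: "why not wait at state 2?"
def foil_policy(state):
    return "stay" if state == 2 else "right"

result = contrast(
    expected_consequences(learned_policy, start=0, horizon=5),
    expected_consequences(foil_policy, start=0, horizon=5),
)
print(result)
```

In this toy setting the contrast reads as an explanation of the form "the learned policy enters the trap once but reaches the goal, whereas the alternative stays safe and never reaches the goal". A stochastic environment would need sampled or model-based expectations rather than a single rollout.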
Related papers
- Experiential Explanations For Reinforcement Learning (2022)
- Why The Agent Made That Decision: Contrastive Explanation Learning For Reinforcement Learning (2024)
- (When) Are Contrastive Explanations Of Reinforcement Learning Helpful? (2022)
- Explainable Reinforcement Learning Via Model Transforms (2022)
- Redefining Counterfactual Explanations For Reinforcement Learning: Overview, Challenges And Opportunities (2022)
- Causal State Distillation For Explainable Reinforcement Learning (2023)
- Explaining Conditions For Reinforcement Learning Behaviors From Real And Imagined Data (2020)
- A Survey On Explainable Reinforcement Learning: Concepts, Algorithms, Challenges (2022)