Causal State Distillation For Explainable Reinforcement Learning
2023 · Wenhao Lu, Xufeng Zhao, Thilo Fryen, et al.
Abstract
Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making it difficult for users to grasp the reasons behind an agent's behaviour. Various approaches have been explored to address this problem, with one promising avenue being reward decomposition (RD). RD is appealing as it sidesteps some of the concerns associated with other methods that attempt to rationalize an agent's behaviour in a post-hoc manner. RD works by exposing various facets of the rewards that contribute to the agent's objectives during training. However, RD alone has limitations as it primarily offers insights based on sub-rewards and does not delve into the intricate cause-and-effect relationships that occur within an RL agent's neural model. In this paper, we present an extension of RD that goes beyond sub-rewards to provide more
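As a rough illustration of the reward decomposition idea described in the abstract, the sketch below keeps one Q-table per sub-reward and sums them to pick an action; the per-component differences between two actions then serve as a simple explanation. This is not the paper's method: the component names ("progress", "safety"), the Q-values, and the helper functions are hypothetical and chosen only for illustration.

```python
import numpy as np

def decompose_q(q_components, state):
    """Return per-component Q-values for one state and their element-wise sum."""
    per_component = {name: q[state] for name, q in q_components.items()}
    total = sum(per_component.values())  # total Q-value per action
    return per_component, total

def explain_choice(q_components, state, chosen, alternative):
    """Per-component advantage of `chosen` over `alternative` in `state`."""
    return {
        name: float(q[state, chosen] - q[state, alternative])
        for name, q in q_components.items()
    }

# Hypothetical toy setup: 1 state, 2 actions, 2 sub-rewards.
# Each array has shape (n_states, n_actions).
q_components = {
    "progress": np.array([[1.0, 3.0]]),
    "safety": np.array([[0.5, -1.0]]),
}

per_component, total = decompose_q(q_components, state=0)
best = int(np.argmax(total))  # action with the highest summed Q-value
explanation = explain_choice(q_components, state=0, chosen=best, alternative=1 - best)
print(best, explanation)
```

In this toy case the chosen action wins on "progress" but loses on "safety", which is the kind of sub-reward trade-off that reward decomposition exposes. It says nothing about the causal relationships inside the agent's model, which is the gap the abstract points to.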
Related papers
- ReCCoVER: Detecting Causal Confusion For Explainable Reinforcement Learning (2022) · 0.00
- Explainable Reinforcement Learning Via A Causal World Model (2023) · 9.03
- Experiential Explanations For Reinforcement Learning (2022) · 2.26
- A Survey On Explainable Reinforcement Learning: Concepts, Algorithms, Challenges (2022) · 0.00
- Explainable Reinforcement Learning Via Model Transforms (2022) · 0.00
- Explainability In Deep Reinforcement Learning (2020) · 0.00
- Contrastive Explanations For Reinforcement Learning In Terms Of Expected Consequences (2018) · 0.00
- Explainable Reinforcement Learning Through A Causal Lens (2019) · 16.69