Explainable Reinforcement Learning Via A Causal World Model
2023 · Zhongwei Yu, Jingqing Ruan, Dengpeng Xing
Abstract
Generating explanations for reinforcement learning (RL) is challenging because actions may have long-term effects on the future. In this paper, we develop a novel framework for explainable RL by learning a causal world model without prior knowledge of the causal structure of the environment. The model captures the influence of actions, allowing us to interpret their long-term effects through causal chains, which show how actions influence environmental variables and ultimately lead to rewards. Unlike most explanatory models, which suffer from low accuracy, our model remains accurate while improving explainability, making it applicable to model-based learning. As a result, we demonstrate that our causal model can serve as a bridge between explainability and learning.
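To make the idea of a causal chain concrete, here is a minimal sketch that enumerates the directed paths from an action to the reward in a small causal graph. It is not the authors' implementation: the graph, its variable names (`velocity`, `position`, `fuel`), and the helper `causal_chains` are hypothetical, and the graph is hand-written here, whereas the paper learns the structure from data.

```python
from typing import Dict, List

# Hypothetical causal graph (an assumption for illustration): each key is a
# variable and each value lists the variables it directly influences.
CAUSAL_GRAPH: Dict[str, List[str]] = {
    "action": ["velocity"],
    "velocity": ["position"],
    "position": ["fuel", "reward"],
    "fuel": ["reward"],
    "reward": [],
}


def causal_chains(graph: Dict[str, List[str]], source: str, target: str) -> List[List[str]]:
    """Enumerate every directed path from source to target by depth-first search."""
    chains: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == target:
            chains.append(path)
            return
        for child in graph.get(node, []):
            if child not in path:  # skip nodes already on the path to avoid cycles
                dfs(child, path + [child])

    dfs(source, [source])
    return chains


chains = causal_chains(CAUSAL_GRAPH, "action", "reward")
for chain in chains:
    print(" -> ".join(chain))
```

Each printed line is one candidate explanation of how the action reaches the reward through intermediate environment variables.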
Related papers
- Explainable Reinforcement Learning Through A Causal Lens (2019)
- Learning Nonlinear Causal Reductions To Explain Reinforcement Learning Policies (2025)
- Learning By Doing: An Online Causal Reinforcement Learning Framework With Causal-aware Policy (2024)
- Causal State Distillation For Explainable Reinforcement Learning (2023)
- Experiential Explanations For Reinforcement Learning (2022)
- Explainable Reinforcement Learning Agents Using World Models (2025)
- Why Online Reinforcement Learning Is Causal (2024)
- Explainable Reinforcement Learning Via Model Transforms (2022)