Evaluating Interpretable Reinforcement Learning By Distilling Policies Into Programs
2025 · Hector Kohler, Quentin Delfosse, Waris Radji, et al.
Abstract
There exist applications of reinforcement learning, such as medicine, where policies need to be "interpretable" by humans. User studies have shown that some policy classes might be more interpretable than others. However, it is costly to conduct human studies of policy interpretability. Furthermore, there is no clear definition of policy interpretability, i.e., no clear metrics for interpretability, and thus claims depend on the chosen definition. We tackle the problem of empirically evaluating policy interpretability without humans. Despite this lack of a clear definition, researchers agree on the notion of "simulatability": policy interpretability should relate to how humans understand policy actions given states. To advance research in interpretable reinforcement learning, we contribute a new methodology to evaluate policy interpretability. This new methodology relies on proxies for simulatability that we use to conduct a large-scale empirical evaluation of policy interpretability. We
Related papers
- A Survey On Interpretable Reinforcement Learning (2021)
- "So, Tell Me About Your Policy...": Distillation Of Interpretable Policies From Deep Reinforcement Learning Agents (2025)
- From Explainability To Interpretability: Interpretable Policies In Reinforcement Learning Via Model Explanation (2025)
- Interpretable Policies For Reinforcement Learning By Genetic Programming (2017)
- Interpretable Off-policy Evaluation In Reinforcement Learning By Highlighting Influential Transitions (2020)
- Three Pathways To Neurosymbolic Reinforcement Learning With Interpretable Model And Policy Networks (2024)
- Towards A Research Community In Interpretable Reinforcement Learning: The InterpPol Workshop (2024)
- S-REINFORCE: A Neuro-symbolic Policy Gradient Approach For Interpretable Reinforcement Learning (2023)