Mitigating Information Loss In Tree-based Reinforcement Learning Via Direct Optimization
2024 · Sascha Marton, Tim Grams, Florian Vogt, et al.
Abstract
Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based, end-to-end learning of interpretable, axis-aligned decision trees within standard on-policy RL algorithms.
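To make the term "axis-aligned decision tree policy" concrete, the following is a minimal sketch of how such a policy maps an observation to an action: each internal node compares a single feature against a threshold. It is not SYMPOL's implementation and does not show the gradient-based training the abstract describes. The class names `Leaf` and `Split`, the function `act`, and the feature indices, thresholds, and actions in the example tree are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    """Terminal node: always returns a fixed discrete action."""
    action: int


@dataclass
class Split:
    """Axis-aligned internal node: tests one feature against one threshold."""
    feature: int          # index into the observation vector
    threshold: float      # go left if obs[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def act(node: Node, obs: Sequence[float]) -> int:
    """Walk from the root to a leaf and return that leaf's action."""
    while isinstance(node, Split):
        if obs[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.action


# Hypothetical depth-2 tree over a 4-dimensional observation (illustrative values).
tree = Split(
    feature=2, threshold=0.0,
    left=Split(feature=3, threshold=0.5, left=Leaf(0), right=Leaf(1)),
    right=Leaf(1),
)

if __name__ == "__main__":
    print(act(tree, [0.0, 0.0, -0.1, 0.0]))
    print(act(tree, [0.0, 0.0, 0.2, 0.0]))
```

Because every decision is a single comparison on a single feature, the whole policy can be read off as a short list of if/else rules, which is the sense of interpretability the abstract refers to.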
Related papers
- Optimizing Interpretable Decision Tree Policies For Reinforcement Learning (2024)
- S-REINFORCE: A Neuro-symbolic Policy Gradient Approach For Interpretable Reinforcement Learning (2023)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Conservative Optimistic Policy Optimization Via Multiple Importance Sampling (2021)
- Upside-down Reinforcement Learning For More Interpretable Optimal Control (2024)
- Interpretable Local Tree Surrogate Policies (2021)
- "So, Tell Me About Your Policy...": Distillation Of Interpretable Policies From Deep Reinforcement Learning Agents (2025)
- Toward Interpretable Deep Reinforcement Learning With Linear Model U-trees (2018)