Continuous Action Reinforcement Learning From A Mixture Of Interpretable Experts
2020 · Riad Akrour, Davide Tateo, Jan Peters
Abstract
Reinforcement learning (RL) has demonstrated its ability to solve high-dimensional tasks by leveraging non-linear function approximators. However, these successes are mostly achieved by 'black-box' policies in simulated domains. When deploying RL to the real world, several concerns regarding the use of a 'black-box' policy might be raised. To make the learned policies more transparent, we propose in this paper a policy iteration scheme that retains a complex function approximator for its internal value predictions but constrains the policy to have a concise, hierarchical, and human-readable structure, based on a mixture of interpretable experts. Each expert selects a primitive action according to a distance to a prototypical state. A key design decision to keep such experts interpretable is to select the prototypical states from trajectory data. The main technical contribution of the paper is to address the challenges introduced by this non-differentiable prototypical state se
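The idea of experts keyed to prototypical states can be illustrated with a minimal sketch. The abstract does not specify the gating function, the distance, or how the mixture is turned into an action, so everything below is an assumption made for illustration: a softmax over negative squared Euclidean distances, a hypothetical `temperature` parameter, a fixed primitive action per expert, and a weighted blend as the output. The names `expert_weights`, `mixture_action`, and `closest_expert` are hypothetical and are not taken from the paper.

```python
import math

def expert_weights(state, prototypes, temperature=1.0):
    """Assumed gating: softmax over negative squared distances to each prototype."""
    scores = [
        -sum((s - p) ** 2 for s, p in zip(state, proto)) / temperature
        for proto in prototypes
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_action(state, prototypes, actions, temperature=1.0):
    """Blend each expert's primitive action by its gating weight (an assumption)."""
    weights = expert_weights(state, prototypes, temperature)
    dim = len(actions[0])
    return [sum(w * a[d] for w, a in zip(weights, actions)) for d in range(dim)]

def closest_expert(state, prototypes):
    """Index of the expert whose prototypical state is nearest to `state`."""
    weights = expert_weights(state, prototypes)
    return max(range(len(weights)), key=weights.__getitem__)

# Hypothetical data: two 2-D prototypical states (imagined as picked from
# trajectory data), each paired with a 1-D primitive action.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
actions = [[-1.0], [1.0]]
```

Under these assumptions the policy can be read directly: for a given state, `closest_expert` names the prototypical state that dominates the decision, and `actions` lists what each expert would do.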
Related papers
- "So, Tell Me About Your Policy...": Distillation Of Interpretable Policies From Deep Reinforcement Learning Agents (2025)
- Probabilistic Mixture-of-experts For Efficient Deep Reinforcement Learning (2021)
- Experiential Explanations For Reinforcement Learning (2022)
- Think Outside The Policy: In-context Steered Policy Optimization (2025)
- Computationally Efficient Reinforcement Learning: Targeted Exploration Leveraging Simple Rules (2022)
- Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning With Clairvoyant Experts (2020)
- S-REINFORCE: A Neuro-symbolic Policy Gradient Approach For Interpretable Reinforcement Learning (2023)
- Explaining Reinforcement Learning Policies Through Counterfactual Trajectories (2022)