Probabilistic Mixture-of-experts For Efficient Deep Reinforcement Learning
2021 · Jie Ren, Yewen Li, Zihan Ding, et al.
Abstract
Deep reinforcement learning (DRL) has recently solved a variety of problems, typically with a unimodal policy representation. However, for tasks with non-unique optima, learning distinguishable skills can be essential for further improving learning efficiency and performance, which calls for a multimodal policy represented as a mixture-of-experts (MOE). To the best of our knowledge, current general-purpose DRL algorithms do not use MOE as a policy function approximator because of the potential difficulty of differentiating through it during policy learning. In this work, we propose a probabilistic mixture-of-experts (PMOE), implemented with a Gaussian mixture model (GMM), for multimodal policies, together with a novel gradient estimator for the non-differentiability problem. The approach can be applied in generic off-policy and on-policy DRL algorithms that use stochastic policies, e.g., Soft Actor-Critic (SAC) and Proximal Policy Optimisation (PPO). Experimental results testify the advant
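To make the idea of a GMM-based multimodal policy concrete, here is a minimal NumPy sketch of sampling an action from a mixture of diagonal Gaussians and evaluating its log-density. It is not the paper's PMOE implementation or its gradient estimator. The function names, the diagonal-covariance assumption, and all numbers are hypothetical, chosen only for illustration. The discrete choice of the component index is the step that a standard reparameterised gradient cannot pass through.

```python
import numpy as np

def sample_gmm_action(weights, means, stds, rng):
    """Draw one action from a diagonal-Gaussian mixture policy (illustrative)."""
    # Discrete expert choice: this categorical draw is the non-differentiable step.
    k = rng.choice(len(weights), p=weights)
    # Reparameterised Gaussian sample from the chosen component.
    action = means[k] + stds[k] * rng.standard_normal(means[k].shape)
    return k, action

def gmm_log_prob(action, weights, means, stds):
    """log pi(a|s) = log sum_k w_k N(a; mu_k, diag(sigma_k^2))."""
    z = (action - means) / stds                      # shape (K, D)
    log_comp = -0.5 * np.sum(z**2 + 2.0 * np.log(stds) + np.log(2.0 * np.pi), axis=1)
    log_joint = np.log(weights) + log_comp           # shape (K,)
    m = np.max(log_joint)                            # log-sum-exp for stability
    return m + np.log(np.sum(np.exp(log_joint - m)))

# Hypothetical two-expert policy over a 2-D action space.
weights = np.array([0.5, 0.5])
means = np.array([[-1.0, 0.0], [1.0, 0.0]])
stds = np.full((2, 2), 0.2)
rng = np.random.default_rng(0)

k, a = sample_gmm_action(weights, means, stds, rng)
logp = gmm_log_prob(a, weights, means, stds)
```

In a real agent the weights, means, and standard deviations would be outputs of a state-conditioned network; they are fixed arrays here only to keep the sketch self-contained.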
Related papers
- SPHERE: Mitigating The Loss Of Spectral Plasticity In Mixture-of-experts For Deep Reinforcement Learning (2026)
- Merging Deterministic Policy Gradient Estimations With Varied Bias-variance Tradeoff For Effective Deep Reinforcement Learning (2019)
- Continuous Action Reinforcement Learning From A Mixture Of Interpretable Experts (2020)
- Contextual Policy Transfer In Reinforcement Learning Domains Via Deep Mixtures-of-experts (2020)
- MEPG: A Minimalist Ensemble Policy Gradient Framework For Deep Reinforcement Learning (2021)
- Dynamic Mixture Of Experts Against Severe Distribution Shifts (2025)
- Double Reinforcement Learning For Efficient Off-policy Evaluation In Markov Decision Processes (2019)
- Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning With Clairvoyant Experts (2020)