Is Model Ensemble Necessary? Model-based RL Via A Single Model With Lipschitz Regularized Value Function
2023 Β· Ruijie Zheng, Xiyao Wang, Huazhe Xu, et al.
Abstract
Probabilistic dynamics model ensemble is widely used in existing model-based reinforcement learning methods as it outperforms a single dynamics model in both asymptotic performance and sample efficiency. In this paper, we provide both practical and theoretical insights on the empirical success of the probabilistic dynamics model ensemble through the lens of Lipschitz continuity. We find that, for a value function, the stronger the Lipschitz condition is, the smaller the gap between the true dynamics- and learned dynamics-induced Bellman operators is, thus enabling the converged value function to be closer to the optimal value function. Hence, we hypothesize that the key functionality of the probabilistic dynamics model ensemble is to regularize the Lipschitz condition of the value function using generated samples. To test this hypothesis, we devise two practical robust training mechanisms through computing the adversarial noise and regularizing the value network's spectral norm to dire
Authors
(none)
Tags
Stats
Related papers
- Deep Reinforcement Learning In A Handful Of Trials Using Probabilistic Dynamics Models (2018)0.00
- Model-based Value Estimation For Efficient Model-free Reinforcement Learning (2018)0.00
- Sample Complexity Of Reinforcement Learning Using Linearly Combined Model Ensembles (2019)0.00
- Sample-efficient Reinforcement Learning With Stochastic Ensemble Value Expansion (2018)0.00
- Deep Autoregressive Density Nets Vs Neural Ensembles For Model-based Offline Reinforcement Learning (2024)0.00
- Diminishing Return Of Value Expansion Methods In Model-based Reinforcement Learning (2023)0.00
- On The Model-based Stochastic Value Gradient For Continuous Reinforcement Learning (2020)0.00
- Model-agnostic Solutions For Deep Reinforcement Learning In Non-ergodic Contexts (2026)0.00