Aggressive Q-learning With Ensembles: Achieving Both High Sample Efficiency And High Asymptotic Performance
2021 · Yanqiu Wu, Xinyue Chen, Che Wang, et al.
Abstract
Recent advances in model-free deep reinforcement learning (DRL) show that simple model-free methods can be highly effective in challenging high-dimensional continuous control tasks. In particular, Truncated Quantile Critics (TQC) achieves state-of-the-art asymptotic training performance on the MuJoCo benchmark with a distributional representation of critics; and Randomized Ensemble Double Q-Learning (REDQ) achieves high sample efficiency that is competitive with state-of-the-art model-based methods using a high update-to-data ratio and target randomization. In this paper, we propose a novel model-free algorithm, Aggressive Q-Learning with Ensembles (AQE), which improves the sample-efficiency performance of REDQ and the asymptotic performance of TQC, thereby providing overall state-of-the-art performance during all stages of training. Moreover, AQE is very simple, requiring neither distributional representation of critics nor target randomization. The effectiveness of AQE is further sup
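The "target randomization" attributed to REDQ above can be illustrated with a small sketch. This is not AQE, which the abstract says needs no target randomization, and it is not the authors' code. It assumes the common formulation in which the Bellman target takes the minimum over a randomly chosen subset of the ensemble's next-state Q-estimates, and it omits the entropy term that soft actor-critic-style methods add. The function name, parameter names, and default values are all hypothetical.

```python
import random
from typing import Optional, Sequence


def randomized_ensemble_target(
    reward: float,
    next_q_values: Sequence[float],
    gamma: float = 0.99,
    subset_size: int = 2,
    done: bool = False,
    rng: Optional[random.Random] = None,
) -> float:
    """Bellman target using the minimum over a random subset of critics.

    next_q_values holds one next-state Q-estimate per ensemble member
    (assumed to come from target networks; that part is not modeled here).
    """
    rng = rng or random.Random()
    if not 1 <= subset_size <= len(next_q_values):
        raise ValueError("subset_size must be between 1 and the ensemble size")
    # Draw a fresh subset of critic indices for this target.
    subset = rng.sample(range(len(next_q_values)), subset_size)
    min_q = min(next_q_values[i] for i in subset)
    # Terminal transitions do not bootstrap from the next state.
    return reward + (0.0 if done else gamma * min_q)


def targets_for_one_env_step(
    reward: float,
    next_q_values: Sequence[float],
    utd_ratio: int,
    rng: Optional[random.Random] = None,
) -> list:
    """A high update-to-data ratio means several updates per environment step;
    each update here draws its own random subset."""
    rng = rng or random.Random()
    return [
        randomized_ensemble_target(reward, next_q_values, rng=rng)
        for _ in range(utd_ratio)
    ]
```

In a real agent each update would also sample a new minibatch from the replay buffer; the second function only shows that the number of target computations scales with the update-to-data ratio.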
Related papers
- Dropout Q-functions For Doubly Efficient Reinforcement Learning (2021)
- Provably Efficient And Agile Randomized Q-learning (2025)
- CrossQ: Batch Normalization In Deep Reinforcement Learning For Greater Sample Efficiency And Simplicity (2019)
- Directional Ensemble Aggregation For Actor-critics (2025)
- Effective Exploration For Deep Reinforcement Learning Via Bootstrapped Q-ensembles Under Tsallis Entropy Regularization (2018)
- Enhancing Sample Efficiency In Multi-agent RL With Uncertainty Quantification And Selective Exploration (2025)
- Towards Applicable Reinforcement Learning: Improving The Generalization And Sample Efficiency With Policy Ensemble (2022)
- SPEQ: Offline Stabilization Phases For Efficient Q-learning In High Update-to-data Ratio Reinforcement Learning (2025)