Proximal Policy Optimization Via Enhanced Exploration Efficiency
2020 · Junwei Zhang, Zhenghao Zhang, Shuai Han, et al.
Abstract
Proximal policy optimization (PPO) is a deep reinforcement learning algorithm with outstanding performance, especially in continuous control tasks, but its performance is still limited by its exploration ability. Classical reinforcement learning offers schemes that make exploration more thorough and better balanced against exploitation, but their algorithmic complexity prevents their use in complex environments. Focusing on continuous control tasks with dense rewards, this paper analyzes the assumption behind the original Gaussian action exploration mechanism in PPO and clarifies how exploration ability influences performance. To address the exploration problem, it then designs an exploration enhancement mechanism based on uncertainty estimation. We apply this exploration enhancement theory to PPO and propose proximal policy optimization with an intrinsic exploration module (IEM-PPO), which can b
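The Gaussian action exploration mentioned in the abstract can be illustrated with a minimal sketch: a diagonal Gaussian policy samples each action dimension around a mean, and the log-probability of the sample is what the PPO ratio is built from. The second function is a generic uncertainty bonus added to the reward. It is an assumption for illustration only, since this excerpt does not describe how IEM-PPO estimates or uses uncertainty; the names `sample_gaussian_action`, `augmented_reward`, and `beta` are hypothetical.

```python
import math
import random


def sample_gaussian_action(mean, log_std, rng):
    """Sample from a diagonal Gaussian policy; return (action, log_prob)."""
    action, log_prob = [], 0.0
    for mu, ls in zip(mean, log_std):
        std = math.exp(ls)
        a = rng.gauss(mu, std)  # exploration noise scales with std
        action.append(a)
        # Log-density of a univariate Gaussian, summed over dimensions.
        log_prob += -0.5 * ((a - mu) / std) ** 2 - ls - 0.5 * math.log(2 * math.pi)
    return action, log_prob


def augmented_reward(extrinsic, uncertainty, beta=0.1):
    """Hypothetical shaping: reward plus a bonus proportional to uncertainty.

    Not the paper's IEM; a generic form of uncertainty-driven exploration.
    """
    return extrinsic + beta * uncertainty


rng = random.Random(0)
action, logp = sample_gaussian_action([0.0, 1.0], [-0.5, -0.5], rng)
bonus_reward = augmented_reward(1.0, 2.0, beta=0.5)
```

A larger `log_std` widens the sampling distribution and so increases undirected exploration, while the bonus term steers exploration toward states the agent is less certain about.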
Related papers
- Proximal Policy Optimization With Adaptive Exploration (2024)
- Policy Optimization With Model-based Explorations (2018)
- Truly Proximal Policy Optimization (2019)
- Proximal Policy Optimization Algorithms (2017)
- KIPPO: Koopman-inspired Proximal Policy Optimization (2025)
- PPO-CMA: Proximal Policy Optimization With Covariance Matrix Adaptation (2018)
- Revisiting Design Choices In Proximal Policy Optimization (2020)
- Simple Policy Optimization (2024)