Understanding The Impact Of Entropy On Policy Optimization
2018 · Zafarali Ahmed, Nicolas Le Roux, Mohammad Norouzi, et al.
Abstract
Entropy regularization is commonly used to improve policy optimization in reinforcement learning. It is believed to help with *exploration* by encouraging the selection of more stochastic policies. In this work, we analyze this claim using new visualizations of the optimization landscape based on randomly perturbing the loss function. We first show that even with access to the exact gradient, policy optimization is difficult due to the geometry of the objective function. Then, we qualitatively show that in some environments, a policy with higher entropy can make the optimization landscape smoother, thereby connecting local optima and enabling the use of larger learning rates. This paper presents new tools for understanding the optimization landscape, shows that policy entropy serves as a regularizer, and highlights the challenge of designing general-purpose policy optimization algorithms.
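To make the idea of probing an entropy-regularized objective with random perturbations concrete, here is a minimal sketch. It is not the authors' code: it assumes a softmax policy over the arms of a bandit with known rewards, and the names `objective`, `perturbation_deltas`, `tau` (entropy coefficient), and `alpha` (perturbation size) are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over action logits.
    z = theta - np.max(theta)
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    # Shannon entropy of a discrete distribution (small constant avoids log(0)).
    return -np.sum(p * np.log(p + 1e-12))

def objective(theta, rewards, tau):
    # Expected reward plus tau times policy entropy (assumed form of the
    # entropy-regularized objective for a one-step bandit).
    p = softmax(theta)
    return float(p @ rewards + tau * entropy(p))

def perturbation_deltas(theta, rewards, tau, n_dirs=500, alpha=0.5, seed=0):
    # Change in the objective along random unit directions around theta.
    rng = np.random.default_rng(seed)
    base = objective(theta, rewards, tau)
    deltas = []
    for _ in range(n_dirs):
        d = rng.normal(size=theta.shape)
        d /= np.linalg.norm(d)
        deltas.append(objective(theta + alpha * d, rewards, tau) - base)
    return np.array(deltas)

if __name__ == "__main__":
    rewards = np.array([1.0, 0.9, 0.1])   # illustrative values only
    theta = np.array([0.0, 3.0, 0.0])     # policy favoring a suboptimal arm
    for tau in (0.0, 0.5):
        deltas = perturbation_deltas(theta, rewards, tau)
        print(tau, deltas.min(), deltas.max())
```

Comparing the spread of `deltas` for different values of `tau` gives a rough picture of how the entropy term reshapes the local landscape around a given parameter vector. A histogram of the values is one natural way to view it.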
Related papers
- Marginalized State Distribution Entropy Regularization In Policy Optimization (2019)
- Policy Optimization Reinforcement Learning With Entropy Regularization (2019)
- Arbitrary Entropy Policy Optimization Breaks The Exploration Bottleneck Of Reinforcement Learning (2025)
- Increasing Entropy To Boost Policy Gradient Performance On Personalization Tasks (2023)
- Examining Policy Entropy Of Reinforcement Learning Agents For Personalization Tasks (2022)
- Complexity-regularized Proximal Policy Optimization (2025)
- Beyond Exact Gradients: Convergence Of Stochastic Soft-max Policy Gradient Methods With Entropy Regularization (2021)
- Do You Need The Entropy Reward (in Practice)? (2022)