Decoupling Regularization From The Action Space
2024 · Sobhan Mohammadpour, Emma Frejinger, Pierre-Luc Bacon
Abstract
Regularized reinforcement learning (RL), particularly the entropy-regularized kind, has gained traction in optimal control and inverse RL. While standard unregularized RL methods remain unaffected by changes in the number of actions, we show that such changes can severely impact their regularized counterparts. This paper demonstrates the importance of decoupling the regularizer from the action space: that is, maintaining a consistent level of regularization regardless of how many actions are involved, so as to avoid over-regularization. Although the problem can be avoided by introducing a task-specific temperature parameter, doing so is often undesirable and cannot solve the problem when action spaces are state-dependent. In the state-dependent action context, different states with varying action spaces are regularized inconsistently. We introduce two solutions: a static temperature selection approach and a dynamic counterpart, universally applicable where this problem arises. Implementing these changes impro
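The dependence on the number of actions can be seen in a one-step toy problem. The sketch below is illustrative only and is not taken from the paper: it uses the standard entropy-regularized (log-sum-exp) value and a hypothetical static rule, `scaled_temperature`, that divides a base temperature by the maximum entropy log(n). Whether this matches the paper's static temperature selection is an assumption; the abstract does not spell the rule out.

```python
import math


def soft_value(rewards, temperature):
    """Entropy-regularized value of a one-step problem:
    temperature * log(sum(exp(r / temperature))), computed stably."""
    m = max(rewards)
    total = sum(math.exp((r - m) / temperature) for r in rewards)
    return m + temperature * math.log(total)


def scaled_temperature(base_temperature, num_actions):
    """Hypothetical static rule (an assumption, not the paper's stated method):
    divide the base temperature by the maximum entropy log(num_actions)."""
    return base_temperature / math.log(num_actions)


# With all rewards equal to zero, the unregularized value is 0 for any number
# of actions, but the regularized value grows like temperature * log(n).
for n in (2, 10, 100):
    rewards = [0.0] * n
    fixed = soft_value(rewards, 1.0)
    scaled = soft_value(rewards, scaled_temperature(1.0, n))
    print(n, round(fixed, 3), round(scaled, 3))
```

With a fixed temperature the entropy bonus in this toy problem grows with the action count, whereas under the assumed rescaling it stays constant. A state-dependent action space would need the rescaling applied per state, which is the case a single task-level temperature cannot cover.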
Related papers
- Regularization Matters In Policy Optimization (2019)
- Entropy Regularized Reinforcement Learning Using Large Deviation Theory (2021)
- Optimal Scheduling Of Entropy Regulariser For Continuous-time Linear-quadratic Reinforcement Learning (2022)
- A Comparative Theoretical Analysis Of Entropy Control Methods In Reinforcement Learning (2026)
- Mutual-information Regularization In Markov Decision Processes And Actor-critic Learning (2019)
- Statistical Analysis Of Inverse Entropy-regularized Reinforcement Learning (2025)
- Transfer RL Across Observation Feature Spaces Via Model-based Regularization (2022)
- Exploration Versus Exploitation In Reinforcement Learning: A Stochastic Control Approach (2018)