Implicit Policy For Reinforcement Learning
2018 · Yunhao Tang, Shipra Agrawal
Abstract
We introduce Implicit Policy, a general class of expressive policies that can flexibly represent complex action distributions in reinforcement learning, together with efficient algorithms for computing entropy-regularized policy gradients. We show empirically that, despite its simple implementation, entropy regularization combined with a rich policy class can attain desirable properties exhibited under the maximum entropy reinforcement learning framework, such as robustness and multi-modality.
Related papers
- An Entropy Regularization Free Mechanism For Policy-based Reinforcement Learning (2021)
- Off-policy Maximum Entropy RL With Future State And Action Visitation Measures (2024)
- Policy Optimization Reinforcement Learning With Entropy Regularization (2019)
- Understanding The Impact Of Entropy On Policy Optimization (2018)
- Arbitrary Entropy Policy Optimization Breaks The Exploration Bottleneck Of Reinforcement Learning (2025)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Increasing Entropy To Boost Policy Gradient Performance On Personalization Tasks (2023)
- Marginalized State Distribution Entropy Regularization In Policy Optimization (2019)