Policy Augmentation: An Exploration Strategy For Faster Convergence Of Deep Reinforcement Learning Algorithms
2021 · Arash Mahyari
Abstract
Despite advancements in deep reinforcement learning algorithms, developing an effective exploration strategy is still an open problem. Most existing exploration strategies are based on simple heuristics, require a model of the environment, or train additional deep neural networks to generate imagination-augmented paths. In this paper, a revolutionary algorithm, called Policy Augmentation, is introduced. Policy Augmentation is based on a newly developed inductive matrix completion method. The proposed algorithm augments the values of unexplored state-action pairs, helping the agent take actions that result in high-value returns during the early episodes. Training deep reinforcement learning algorithms with high-value rollouts leads to faster convergence. Our experiments show the superior performance of Policy Augmentation. The code can be found at: https://github.com/arashmahyari/PolicyAugmentation.
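To make the idea of augmenting the values of unexplored state-action pairs concrete, here is a minimal sketch. It is not the paper's inductive matrix completion method: it substitutes a plain low-rank completion (iterative truncated SVD) on a small tabular value matrix, and the names `complete_values` and `greedy_action`, the `rank` and `iters` parameters, and the tabular setting are all assumptions made for illustration.

```python
import numpy as np


def complete_values(q, observed, rank=1, iters=300):
    """Estimate unexplored entries of a state-action value matrix.

    Stand-in technique (an assumption, not the paper's method): repeatedly
    project onto rank-`rank` matrices via truncated SVD while keeping the
    explored (observed) entries fixed at their known values.
    """
    q = np.asarray(q, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    # Start unexplored entries at the mean of the explored values.
    filled = np.where(observed, q, q[observed].mean())
    for _ in range(iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep explored values; overwrite only the unexplored ones.
        filled = np.where(observed, q, low_rank)
    return filled


def greedy_action(values, state):
    """Pick the action with the highest (possibly augmented) value."""
    return int(np.argmax(values[state]))
```

An agent could call `complete_values` on its current value estimates early in training and then use `greedy_action` on the completed matrix, so that unexplored pairs with high estimated value are tried sooner. How well this works depends on the value matrix being close to low rank, which this sketch simply assumes.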
Related papers
- Experience Augmentation: Boosting And Accelerating Off-policy Multi-agent Reinforcement Learning (2020)
- Exploring More When It Needs In Deep Reinforcement Learning (2021)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Entropy Augmented Reinforcement Learning (2022)
- Improved Exploration Through Latent Trajectory Optimization In Deep Deterministic Policy Gradient (2019)
- Generalization Of Reinforcement Learning With Policy-aware Adversarial Data Augmentation (2021)
- Learning To Explore With Meta-policy Gradient (2018)
- Off-policy Reinforcement Learning With Model-based Exploration Augmentation (2025)