PNS: Population-guided Novelty Search For Reinforcement Learning In Hard Exploration Environments
2018 · Qihao Liu, Yujia Wang, Xiaofeng Liu
Abstract
Reinforcement Learning (RL) has made remarkable achievements, but it still suffers from inadequate exploration strategies, sparse reward signals, and deceptive reward functions. To alleviate these problems, this paper proposes Population-guided Novelty Search (PNS), a parallel learning method. In PNS, the population is divided into multiple sub-populations, each of which has one chief agent and several exploring agents. The chief agent evaluates the policies learned by the exploring agents and shares the optimal policy with all sub-populations. The exploring agents learn their policies collaboratively under the guidance of the optimal policy and, at the same time, upload their policies to the chief agent. To balance exploration and exploitation, Novelty Search (NS) is employed in every chief agent to encourage policies with high novelty while maximizing per-episode performance. We apply PNS to the twin delayed deep deterministic (TD3) policy gradient algorithm. The effectiveness of
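The chief agent's role described in the abstract (scoring uploaded policies by both novelty and per-episode performance, then sharing the best one) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the `Candidate` record, the behavior descriptor (a 2-D point), the k-nearest-neighbor novelty measure commonly used in Novelty Search, and the weighted-sum selection rule with its `weight` parameter. The paper's actual scoring rule and its TD3 integration are not shown in this excerpt.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one exploring agent's uploaded policy."""
    name: str
    behavior: tuple        # assumed behavior descriptor, e.g. final (x, y) position
    episode_return: float  # per-episode performance


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance to the k nearest behaviors in the archive."""
    if not archive:
        return 0.0
    nearest = sorted(math.dist(behavior, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


def normalize(values):
    """Min-max normalize to [0, 1]; all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def chief_select(candidates, archive, weight=0.5, k=3):
    """Assumed selection rule: weighted sum of normalized return and novelty."""
    returns = normalize([c.episode_return for c in candidates])
    novelties = normalize([novelty(c.behavior, archive, k) for c in candidates])
    scores = [(1 - weight) * r + weight * n for r, n in zip(returns, novelties)]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


# Behaviors already seen by this sub-population (illustrative data only).
archive = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
candidates = [
    Candidate("explorer_a", (0.05, 0.05), 10.0),  # high return, familiar behavior
    Candidate("explorer_b", (3.0, 4.0), 8.0),     # slightly lower return, very novel
    Candidate("explorer_c", (1.0, 1.0), 2.0),     # low return, moderately novel
]

best, scores = chief_select(candidates, archive, weight=0.5)
print(best.name)
```

With `weight=0.0` the rule reduces to picking the highest-return policy, and raising `weight` shifts the choice toward policies whose behavior is far from anything in the archive. This is one simple way to express the exploration and exploitation trade-off the abstract mentions.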
Related papers
- Improving Exploration In Evolution Strategies For Deep Reinforcement Learning Via A Population Of Novelty-seeking Agents (2017)
- Novelty Search For Deep Reinforcement Learning Policy Network Weights By Action Sequence Edit Metric Distance (2019)
- Adaptive Combination Of A Genetic Algorithm And Novelty Search For Deep Neuroevolution (2022)
- Learning In Sparse Rewards Settings Through Quality-diversity Algorithms (2022)
- Ardns-fn-quantum: A Quantum-enhanced Reinforcement Learning Framework With Cognitive-inspired Adaptive Exploration For Dynamic Environments (2025)
- Never Give Up: Learning Directed Exploration Strategies (2020)
- Continuously Discovering Novel Strategies Via Reward-switching Policy Optimization (2022)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)