Novelty Search For Deep Reinforcement Learning Policy Network Weights By Action Sequence Edit Metric Distance
2019 · Ethan C. Jackson, Mark Daley
Abstract
Reinforcement learning (RL) problems often feature deceptive local optima, and learning methods that optimize purely for reward signal often fail to learn strategies for overcoming them. Deep neuroevolution and novelty search have been proposed as effective alternatives to gradient-based methods for learning RL policies directly from pixels. In this paper, we introduce and evaluate the use of novelty search over agent action sequences by string edit metric distance as a means for promoting innovation. We also introduce a method for stagnation detection and population resampling inspired by recent developments in the RL community that uses the same mechanisms as novelty search to promote and develop innovative policies. Our methods extend a state-of-the-art method for deep neuroevolution using a simple-yet-effective genetic algorithm (GA) designed to efficiently learn deep RL policy network weights. Experiments using four games from the Atari 2600 benchmark were conducted. Results provi
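The idea of scoring novelty by edit distance between action sequences can be illustrated with a minimal sketch. It assumes Levenshtein distance as the edit metric and the mean distance to the k nearest archived sequences as the novelty score; the abstract specifies neither, and the names `edit_distance`, `novelty`, `archive`, and `k` are hypothetical, not taken from the paper.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two action sequences (assumed metric)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


def novelty(actions: Sequence, archive: Sequence[Sequence], k: int = 3) -> float:
    """Mean edit distance to the k nearest archived sequences (assumed score)."""
    if not archive:
        return 0.0
    nearest = sorted(edit_distance(actions, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


# Hypothetical usage: actions are integer indices into a discrete action set.
archive = [[0, 0, 0], [1, 1, 1], [0, 0, 1]]
score = novelty([0, 0, 0], archive, k=2)
```

In a sketch like this, a policy whose action sequence differs from everything in the archive receives a high score regardless of the reward it earned, which is what lets selection escape deceptive local optima.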
Related papers
- Improving Exploration In Evolution Strategies For Deep Reinforcement Learning Via A Population Of Novelty-seeking Agents (2017)
- Solving Deep Reinforcement Learning Tasks With Evolution Strategies And Linear Policy Networks (2024)
- PNS: Population-guided Novelty Search For Reinforcement Learning In Hard Exploration Environments (2018)
- Adaptive Combination Of A Genetic Algorithm And Novelty Search For Deep Neuroevolution (2022)
- Improving Policy Gradient By Exploring Under-appreciated Rewards (2016)
- Learning Self-imitating Diverse Policies (2018)
- Learning In Sparse Rewards Settings Through Quality-diversity Algorithms (2022)
- Neuroevolution Is A Competitive Alternative To Reinforcement Learning For Skill Discovery (2022)