Supplementing Gradient-based Reinforcement Learning With Simple Evolutionary Ideas
2023 · Harshad Khadilkar
Abstract
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL), through the use of evolutionary operators. The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space. Unlike prior literature on combining evolutionary search (ES) with RL, this work does not generate a distribution of agents from a common mean and covariance matrix. Neither does it require the evaluation of the entire population of policies at every time step. Instead, we focus on gradient-based training throughout the life of every policy (individual), with a sparse amount of evolutionary exploration. The resulting algorithm is shown to be robust to hyperparameter variations. As a surprising corollary, we show that simply initialising and training multiple RL agents with a common memory (with no further evolutionary
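The mechanism described in the abstract (several agents trained by gradient steps on one shared experience buffer, with only occasional crossover and mutation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform crossover, the Gaussian mutation, the rule of replacing the weakest agent with a child of the two fittest, the use of each agent's most recent training return as its fitness, and the hypothetical `rollout` and `gradient_step` callables that the caller must supply.

```python
import random
from collections import deque


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: copy each parameter from one parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(params, sigma, rng):
    """Gaussian mutation: add zero-mean noise to every parameter."""
    return [p + rng.gauss(0.0, sigma) for p in params]


def evolve(population, fitness, sigma, rng):
    """Replace the weakest agent with a mutated child of the two fittest.

    Assumed selection rule, chosen for illustration only.
    Requires at least two agents; returns a new population list.
    """
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    best, second, worst = order[0], order[1], order[-1]
    child = mutate(crossover(population[best], population[second], rng), sigma, rng)
    new_population = [list(p) for p in population]
    new_population[worst] = child
    return new_population


def train(population, rollout, gradient_step, steps, evolve_every, sigma, seed=0):
    """Gradient training for every agent, with sparse evolutionary steps.

    rollout(params, rng) -> (transitions, episode_return)   # hypothetical
    gradient_step(params, buffer, rng) -> new params         # hypothetical
    """
    rng = random.Random(seed)
    population = [list(p) for p in population]  # do not modify the caller's lists
    shared_buffer = deque(maxlen=10_000)        # one buffer common to all agents
    for step in range(1, steps + 1):
        fitness = []
        for i, params in enumerate(population):
            transitions, episode_return = rollout(params, rng)
            shared_buffer.extend(transitions)   # every agent writes to the same memory
            fitness.append(episode_return)      # training return reused as fitness
            population[i] = gradient_step(params, shared_buffer, rng)
        if step % evolve_every == 0:            # evolutionary operators applied rarely
            population = evolve(population, fitness, sigma, rng)
    return population, shared_buffer
```

Setting `evolve_every` larger than `steps` disables the evolutionary operators entirely, which leaves a set of independently initialised agents that share only the buffer.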
Related papers
- Evolution-guided Policy Gradient In Reinforcement Learning (2018)
- Improving Exploration In Evolution Strategies For Deep Reinforcement Learning Via A Population Of Novelty-seeking Agents (2017)
- Accelerating Reinforcement Learning With A Directional-gaussian-smoothing Evolution Strategy (2020)
- Evolutionary Reinforcement Learning: A Survey (2023)
- CEM-RL: Combining Evolutionary And Gradient-based Methods For Policy Search (2018)
- An Efficient Asynchronous Method For Integrating Evolutionary And Gradient-based Policy Search (2020)
- ERL-Re\(^2\): Efficient Evolutionary Reinforcement Learning With Shared State Representation And Individual Policy Representation (2022)
- Collaborative Evolutionary Reinforcement Learning (2019)