Learning In Sparse Rewards Settings Through Quality-diversity Algorithms
2022 · Giuseppe Paolo
Abstract
In the Reinforcement Learning (RL) framework, learning is guided by a reward signal. This means that in sparse-reward settings the agent has to focus on exploration in order to discover which action, or set of actions, leads to the reward. RL agents usually struggle with this. Exploration is the focus of Quality-Diversity (QD) methods. In this thesis, we approach the problem of sparse rewards with these algorithms, and in particular with Novelty Search (NS), a method that focuses only on the diversity of the possible policies' behaviors. The first part of the thesis focuses on learning a representation of the space in which the diversity of the policies is evaluated. In this regard, we propose the TAXONS algorithm, a method that learns a low-dimensional representation of the search space through an AutoEncoder. While effective, TAXONS still requires information on when to capture the observation used to learn said space. For this, we study multiple ways, and in pa
Related papers
- Unsupervised Learning And Exploration Of Reachable Outcome Space (2019)
- Learning Self-imitating Diverse Policies (2018)
- Selection-expansion: A Unifying Framework For Motion-planning And Diversity Search Algorithms (2021)
- Harnessing Distribution Ratio Estimators For Learning Agents With Quality And Diversity (2020)
- Improving Exploration In Evolution Strategies For Deep Reinforcement Learning Via A Population Of Novelty-seeking Agents (2017)
- Diversity Through Exclusion (DTE): Niche Identification For Reinforcement Learning Through Value-decomposition (2023)
- Approximating Gradients For Differentiable Quality Diversity In Reinforcement Learning (2022)
- Effective Diversity In Population Based Reinforcement Learning (2020)