Diversity Through Exclusion (DTE): Niche Identification For Reinforcement Learning Through Value-decomposition
2023 · Peter Sunehag, Alexander Sasha Vezhnevets, Edgar Duéñez-Guzmán, et al.
Abstract
Many environments contain numerous available niches of variable value, each associated with a different local optimum in the space of behaviors (policy space). In such situations it is often difficult to design a learning process capable of evading distraction by poor local optima long enough to stumble upon the best available niche. In this work we propose a generic reinforcement learning (RL) algorithm that performs better than baseline deep Q-learning algorithms in such environments with multiple variably-valued niches. The algorithm we propose consists of two parts: an agent architecture and a learning rule. The agent architecture contains multiple sub-policies. The learning rule is inspired by fitness sharing in evolutionary computation and is applied in reinforcement learning by using Value-Decomposition Networks in a novel manner for a single agent's internal population. It can concretely be understood as adding an extra loss term where one policy's experience is also used to update a
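The abstract is cut off before it finishes describing the extra loss term, so the sketch below does not attempt to reproduce the paper's learning rule. It only illustrates the generic value-decomposition idea the abstract refers to: the values of several members (here, sub-policies) are summed, and a single temporal-difference error is computed on the sum. The function name `vdn_td_loss`, its arguments, and the use of a greedy bootstrap target are illustrative assumptions, not taken from the paper.

```python
def vdn_td_loss(q_values, actions, reward, next_q_values, gamma=0.99):
    """Squared TD error on the summed value of several sub-policies.

    Hypothetical helper for illustration only (not the paper's loss).

    q_values, next_q_values: one list of per-action values per sub-policy,
        for the current and the next state respectively.
    actions: the action index each sub-policy is evaluated on.
    reward: scalar reward observed for the transition.
    """
    # Value decomposition: the joint value is the sum of per-member values.
    q_tot = sum(q[a] for q, a in zip(q_values, actions))
    # Assumed bootstrap target: each member acts greedily in the next state.
    next_q_tot = sum(max(q) for q in next_q_values)
    target = reward + gamma * next_q_tot
    td_error = target - q_tot
    return 0.5 * td_error ** 2


# Two sub-policies, two actions each.
loss = vdn_td_loss(
    q_values=[[1.0, 2.0], [0.0, 3.0]],
    actions=[1, 0],
    reward=1.0,
    next_q_values=[[1.0, 0.0], [2.0, 4.0]],
    gamma=0.5,
)
```

Because the error is taken on the sum, a gradient step on this loss moves every member's value estimate, including members that did not generate the experience. How the paper turns this into a niche-exclusion signal is not recoverable from the truncated abstract.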
Related papers
- Learning In Sparse Rewards Settings Through Quality-diversity Algorithms (2022)
- Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning (2019)
- Dynamic Value Estimation For Single-task Multi-scene Reinforcement Learning (2020)
- Rethinking Value Function Learning For Generalization In Reinforcement Learning (2022)
- Effective Diversity In Population Based Reinforcement Learning (2020)
- Diversity For Contingency: Learning Diverse Behaviors For Efficient Adaptation And Transfer (2023)
- DGPO: Discovering Multiple Strategies With Diversity-guided Policy Optimization (2022)
- MULEX: Disentangling Exploitation From Exploration In Deep RL (2019)