Learning Synthetic Environments And Reward Networks For Reinforcement Learning
2022 · Fabio Ferreira, Thomas Nierhoff, Andreas Saelinger, et al.
Abstract
We introduce Synthetic Environments (SEs) and Reward Networks (RNs), represented by neural networks, as proxy environment models for training Reinforcement Learning (RL) agents. We show that an agent, after being trained exclusively on the SE, is able to solve the corresponding real environment. While an SE acts as a full proxy to a real environment by learning about its state dynamics and rewards, an RN is a partial proxy that learns to augment or replace rewards. We use bi-level optimization to evolve SEs and RNs: the inner loop trains the RL agent, and the outer loop trains the parameters of the SE / RN via an evolution strategy. We evaluate our proposed new concept on a broad range of RL algorithms and classic control environments. In a one-to-one comparison, learning an SE proxy requires more interactions with the real environment than training agents only on the real environment. However, once such an SE has been learned, we do not need any interactions with the real environment
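The bi-level scheme described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the one-step task, the constant `TARGET`, the one-parameter synthetic reward model, and the function names `real_return`, `train_agent_on_se`, and `evolve_se` are all assumptions made for illustration. The inner loop trains a fresh agent only on the synthetic environment, and the outer loop uses a simple evolution strategy to update the synthetic environment's parameter based on how that agent performs on the real task.

```python
import numpy as np

TARGET = 2.0  # hypothetical optimum of the toy "real" task (assumption)


def real_return(action):
    """Toy real environment: a one-step task whose return peaks at TARGET."""
    return -(action - TARGET) ** 2


def train_agent_on_se(psi, rng, steps=50, lr=0.1):
    """Inner loop: train a fresh agent using only the synthetic environment.

    The toy SE is a one-parameter reward model r_psi(a) = -(a - psi)^2.
    The "agent" is a single scalar action trained by gradient ascent.
    """
    action = rng.normal()  # random agent initialisation
    for _ in range(steps):
        grad = -2.0 * (action - psi)  # derivative of r_psi w.r.t. the action
        action += lr * grad
    return action


def evolve_se(generations=200, population=16, sigma=0.1, lr=0.1, seed=0):
    """Outer loop: evolution strategy over the SE parameter psi."""
    rng = np.random.default_rng(seed)
    psi = 0.0
    for _ in range(generations):
        noise = rng.normal(size=population)
        scores = np.empty(population)
        for i, eps in enumerate(noise):
            # Train on the perturbed SE, then score the agent on the real task.
            agent = train_agent_on_se(psi + sigma * eps, rng)
            scores[i] = real_return(agent)
        # ES gradient estimate with a mean baseline to reduce variance.
        grad_est = np.dot(noise, scores - scores.mean()) / (population * sigma)
        psi += lr * grad_est
    return psi


if __name__ == "__main__":
    psi = evolve_se()
    agent = train_agent_on_se(psi, np.random.default_rng(1))
    print("learned SE parameter:", psi)
    print("real return of SE-trained agent:", real_return(agent))
```

In this sketch the real environment is queried only inside the outer loop to score candidates; once `psi` has been evolved, `train_agent_on_se` trains new agents without touching `real_return`, which mirrors the property the abstract highlights.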
Related papers
- Learning Synthetic Environments For Reinforcement Learning With Evolution Strategies (2021) · 0.00
- Discovering Minimal Reinforcement Learning Environments (2024) · 0.00
- Evolutionary Reinforcement Learning: A Survey (2023) · 13.93
- Reward Models In Deep Reinforcement Learning: A Survey (2025) · 0.00
- Reward-sharing Relational Networks In Multi-agent Reinforcement Learning As A Framework For Emergent Behavior (2022) · 2.26
- Learning To Design Games: Strategic Environments In Reinforcement Learning (2017) · 0.00
- Emergent Social Learning Via Multi-agent Reinforcement Learning (2020) · 0.00
- Improving Generalization To New Environments And Removing Catastrophic Forgetting In Reinforcement Learning By Using An Eco-system Of Agents (2022) · 0.00