Procedural Generation Of Meta-reinforcement Learning Tasks
2023 · Thomas Miconi
Abstract
Open-endedness stands to benefit from the ability to generate an infinite variety of diverse, challenging environments. One particularly interesting type of challenge is meta-learning ("learning-to-learn"), a hallmark of intelligent behavior. However, the number of meta-learning environments in the literature is limited. Here we describe a parametrized space for simple meta-reinforcement learning (meta-RL) tasks with arbitrary stimuli. The parametrization allows us to randomly generate an arbitrary number of novel simple meta-learning tasks. The parametrization is expressive enough to include many well-known meta-RL tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task and others. Simple extensions allow it to capture tasks based on two-dimensional topological spaces, such as full mazes or find-the-spot domains. We describe a number of randomly generated meta-RL domains of varying complexity and discuss potential issues arising from random generation.
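To make the idea of drawing meta-RL tasks from a distribution concrete, here is a minimal sketch built on the simplest task family the abstract names, bandit problems. It is not the paper's parametrization, which the abstract does not specify. The names `BanditTask`, `sample_task`, and `run_episode`, the uniform prior over arm probabilities, and the epsilon-greedy placeholder learner are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class BanditTask:
    """One sampled task: a bandit whose reward probabilities are hidden from the agent."""
    probs: list

    def step(self, arm, rng):
        # Reward is 1.0 with the chosen arm's hidden probability, else 0.0.
        return 1.0 if rng.random() < self.probs[arm] else 0.0


def sample_task(n_arms, rng):
    # Assumed task distribution: each arm's probability is uniform on [0, 1).
    return BanditTask([rng.random() for _ in range(n_arms)])


def run_episode(task, n_trials, rng, epsilon=0.1):
    # Placeholder within-episode learner (epsilon-greedy on running means).
    # A meta-RL agent would instead learn this adaptation rule across tasks.
    n_arms = len(task.probs)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    total = 0.0
    for _ in range(n_trials):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit
        reward = task.step(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Outer loop: a fresh task per episode, so the agent must re-learn each time.
    for episode in range(3):
        task = sample_task(n_arms=3, rng=rng)
        print(episode, run_episode(task, n_trials=50, rng=rng))
```

The outer loop resamples the task every episode, which is what makes the problem one of learning to learn: no fixed arm is best across episodes, so only a good within-episode exploration strategy transfers.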