Improving Generalization In Meta Reinforcement Learning Using Learned Objectives
2019 · Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber
Abstract
Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans. Our novel meta reinforcement learning algorithm MetaGenRL is inspired by this process. MetaGenRL distills the experiences of many complex agents to meta-learn a low-complexity neural objective function that decides how future individuals will learn. Unlike recent meta-RL algorithms, MetaGenRL can generalize to new environments that are entirely different from those used for meta-training. In some cases, it even outperforms human-engineered RL algorithms. MetaGenRL uses off-policy second-order gradients during meta-training that greatly increase its sample efficiency.
Related papers
- Meta-gradient Reinforcement Learning With An Objective Discovered Online (2020)
- A Tutorial On Meta-reinforcement Learning (2023)
- Enhancing Online Reinforcement Learning With Meta-learned Objective From Offline Data (2025)
- Theoretical Analysis Of Meta Reinforcement Learning: Generalization Bounds And Convergence Guarantees (2024)
- First-explore, Then Exploit: Meta-learning To Solve Hard Exploration-exploitation Trade-offs (2023)
- Discovering General Reinforcement Learning Algorithms With Adversarial Environment Design (2023)
- Decoupling Exploration And Exploitation For Meta-reinforcement Learning Without Sacrifices (2020)
- RL\(^3\): Boosting Meta Reinforcement Learning Via RL Inside RL\(^2\) (2023)