Guided Meta-policy Search
2019 · Russell Mendonca, Abhishek Gupta, Rosen Kralev, et al.
Abstract
Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch. Meta-RL aims to address this challenge by leveraging experience from previous tasks so as to more quickly solve new tasks. However, in practice, these algorithms generally also require large amounts of on-policy experience during the meta-training process, making them impractical for use in many problems. To this end, we propose to learn a reinforcement learning procedure in a federated way, where individual off-policy learners can solve the individual meta-training tasks, and then consolidate these solutions into a single meta-learner. Since the central meta-learner learns by imitating the solutions to the individual tasks, it can accommodate either the standard meta-RL problem setting or a hybrid setting where some or all tasks are provided with example demonstrations. The former results in an approach that can le
Related papers
- A Tutorial On Meta-reinforcement Learning (2023)
- Offline Meta-reinforcement Learning With Online Self-supervision (2021)
- Decoupling Exploration And Exploitation For Meta-reinforcement Learning Without Sacrifices (2020)
- Context Meta-reinforcement Learning Via Neuromodulation (2021)
- Efficient Off-policy Meta-reinforcement Learning Via Probabilistic Context Variables (2019)
- Unsupervised Meta-learning For Reinforcement Learning (2018)
- ProMP: Proximal Meta-policy Search (2018)
- Efficient Meta Reinforcement Learning For Preference-based Fast Adaptation (2022)