Meta-reinforcement Learning Robust To Distributional Shift Via Model Identification And Experience Relabeling
2020 · Russell Mendonca, Xinyang Geng, Chelsea Finn, et al.
Abstract
Reinforcement learning algorithms can acquire policies for complex tasks autonomously. However, the number of samples required to learn a diverse set of skills can be prohibitively large. While meta-reinforcement learning methods have enabled agents to leverage prior experience to adapt quickly to new tasks, their performance depends crucially on how close the new task is to the previously experienced tasks. Current approaches are either not able to extrapolate well, or can do so at the expense of requiring extremely large amounts of data for on-policy meta-training. In this work, we present model identification and experience relabeling (MIER), a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time. Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data, more easily than policies and value functions. These dynamics mo
Related papers
- Distributionally Adaptive Meta Reinforcement Learning (2022)
- Mitigating Distribution Shift In Model-based Offline RL Via Shifts-aware Reward Learning (2024)
- A Tutorial On Meta-reinforcement Learning (2023)
- Double Meta-learning For Data Efficient Policy Optimization In Non-stationary Environments (2020)
- Distributionally Robust Model-based Reinforcement Learning With Large State Spaces (2023)
- Offline Meta-reinforcement Learning With Online Self-supervision (2021)
- Live In The Moment: Learning Dynamics Model Adapted To Evolving Policy (2022)
- Meta Reinforcement Learning With Distribution Of Exploration Parameters Learned By Evolution Strategies (2018)