Double Meta-learning For Data Efficient Policy Optimization In Non-stationary Environments
2020 · Elahe Aghapour, Nora Ayanian
Abstract
We are interested in learning models of non-stationary environments, which can be framed as a multi-task learning problem. Model-free reinforcement learning algorithms can achieve good asymptotic performance in multi-task learning, but at the cost of extensive sampling, because they require learning from scratch. While model-based approaches are among the most data-efficient learning algorithms, they still struggle with complex tasks and model uncertainties. Meta-reinforcement learning addresses the efficiency and generalization challenges of multi-task learning by quickly leveraging a meta-prior policy for a new task. In this paper, we propose a meta-reinforcement learning approach that learns the dynamic model of a non-stationary environment, which is later used for meta-policy optimization. Due to the sample efficiency of model-based learning methods, we are able to simultaneously train both the meta-model of the non-stationary environment and the meta-policy until dynamic mode
Related papers
- A Model-based Approach For Sample-efficient Multi-task Reinforcement Learning (2019) 0.00
- A Policy Gradient Algorithm For Learning To Learn In Multiagent Reinforcement Learning (2020) 0.00
- Efficient Meta Reinforcement Learning For Preference-based Fast Adaptation (2022) 0.00
- Offline Meta-reinforcement Learning With Online Self-supervision (2021) 0.00
- Guided Meta-policy Search (2019) 0.00
- Towards An Adaptable And Generalizable Optimization Engine In Decision And Control: A Meta Reinforcement Learning Approach (2024) 0.00
- Distributionally Adaptive Meta Reinforcement Learning (2022) 2.26
- Context Meta-reinforcement Learning Via Neuromodulation (2021) 6.34