Decoupling Exploration And Exploitation For Meta-reinforcement Learning Without Sacrifices
2020 · Evan Zheran Liu, Aditi Raghunathan, Percy Liang, et al.
Abstract
The goal of meta-reinforcement learning (meta-RL) is to build agents that can quickly learn new tasks by leveraging prior experience on related tasks. Learning a new task often requires both exploring to gather task-relevant information and exploiting this information to solve the task. In principle, optimal exploration and exploitation can be learned end-to-end by simply maximizing task performance. However, such meta-RL approaches struggle with local optima due to a chicken-and-egg problem: learning to explore requires good exploitation to gauge the exploration's utility, but learning to exploit requires information gathered via exploration. Optimizing separate objectives for exploration and exploitation can avoid this problem, but prior meta-RL exploration objectives yield suboptimal policies that gather information irrelevant to the task. We alleviate both concerns by constructing an exploitation objective that automatically identifies task-relevant information and an exploration o
Related papers
- First-explore, Then Exploit: Meta-learning To Solve Hard Exploration-exploitation Trade-offs (2023)
- Exploitation Is All You Need... For Exploration (2025)
- Efficient Reinforcement Learning Via Decoupling Exploration And Utilization (2023)
- Guided Meta-policy Search (2019)
- Improving Generalization In Meta Reinforcement Learning Using Learned Objectives (2019)
- Offline Meta Learning Of Exploration (2020)
- Model-based Adversarial Meta-reinforcement Learning (2020)
- Boosting Exploration In Multi-task Reinforcement Learning Using Adversarial Networks (2022)