Hypothesis Network Planned Exploration For Rapid Meta-reinforcement Learning Adaptation
2023 · Maxwell Joseph Jacobson, Rohan Menon, John Zeng, et al.
Abstract
Meta-Reinforcement Learning (Meta-RL) learns optimal policies across a series of related tasks. A central challenge in Meta-RL is rapidly identifying which previously learned task is most similar to a new one, in order to adapt to it quickly. Prior approaches, despite significant success, typically rely on passive exploration strategies such as periods of random action to characterize the new task in relation to the learned ones. While sufficient when tasks are clearly distinguishable, passive exploration limits adaptation speed when informative transitions are rare or revealed only by specific behaviors. We introduce Hypothesis-Planned Exploration (HyPE), a method that actively plans sequences of actions during adaptation to efficiently identify the most similar previously learned task. HyPE operates within a joint latent space, where state-action transitions from different tasks form distinct paths. This latent-space planning approach enables HyPE to serve as a drop-in improvement fo
Related papers
- Efficient Meta Reinforcement Learning For Preference-based Fast Adaptation (2022)
- Hierarchical Meta-reinforcement Learning Via Automated Macro-action Discovery (2024)
- Boosting Hierarchical Reinforcement Learning With Meta-learning For Complex Task Adaptation (2024)
- Decoupling Exploration And Exploitation For Meta-reinforcement Learning Without Sacrifices (2020)
- Exploration In Approximate Hyper-state Space For Meta Reinforcement Learning (2020)
- First-explore, Then Exploit: Meta-learning To Solve Hard Exploration-exploitation Trade-offs (2023)
- Context Meta-reinforcement Learning Via Neuromodulation (2021)
- Guided Meta-policy Search (2019)