Efficient Meta Reinforcement Learning For Preference-based Fast Adaptation
2022 · Zhizhou Ren, Anji Liu, Yitao Liang, et al.
Abstract
Learning new task-specific skills from a few trials is a fundamental challenge for artificial intelligence. Meta reinforcement learning (meta-RL) tackles this problem by learning transferable policies that support few-shot adaptation to unseen tasks. Despite recent advances in meta-RL, most existing methods require access to the environmental reward function of new tasks to infer the task objective, which is not realistic in many practical applications. To bridge this gap, we study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning. We develop a meta-RL algorithm that enables fast policy adaptation with preference-based feedback. The agent can adapt to new tasks by querying a human's preference between behavior trajectories instead of using per-step numeric rewards. By extending techniques from information theory, our approach can design query sequences to maximize the information gain from human interactions while tolerating the inherent er
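The idea of choosing preference queries for maximal information gain under a noisy responder can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic greedy Bayesian query selector, and it assumes a finite set of candidate tasks, a known return for each trajectory under each task, and a responder who gives the wrong answer with a fixed probability `eps`. All function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def answer_likelihood(task_returns, i, j, eps):
    """P(responder says 'i is preferred over j' | task), assuming a fixed error rate eps."""
    prefers_i = task_returns[i] > task_returns[j]
    return 1.0 - eps if prefers_i else eps


def information_gain(belief, returns, i, j, eps):
    """Mutual information between the noisy answer to query (i, j) and the task identity."""
    likes = [answer_likelihood(r, i, j, eps) for r in returns]
    p_yes = sum(b * l for b, l in zip(belief, likes))
    # Under the fixed-error assumption, H(answer | task) equals H(eps) for every task.
    return binary_entropy(p_yes) - binary_entropy(eps)


def select_query(belief, returns, eps):
    """Greedily pick the trajectory pair with the largest expected information gain."""
    n_traj = len(returns[0])
    return max(
        combinations(range(n_traj), 2),
        key=lambda q: information_gain(belief, returns, q[0], q[1], eps),
    )


def update_belief(belief, returns, i, j, i_preferred, eps):
    """Bayes update of the task belief after observing one noisy preference."""
    post = []
    for b, r in zip(belief, returns):
        like = answer_likelihood(r, i, j, eps)
        post.append(b * (like if i_preferred else 1.0 - like))
    total = sum(post)
    return [p / total for p in post]


# Toy data (hypothetical): returns[k][t] is the return of trajectory t under task k.
returns = [
    [3.0, 0.0, 2.0],
    [3.0, 1.0, 2.0],
    [2.0, 3.0, 1.0],
    [0.0, 2.0, 3.0],
]
belief = [0.25, 0.25, 0.25, 0.25]
query = select_query(belief, returns, eps=0.1)
posterior = update_belief(belief, returns, query[0], query[1], True, eps=0.1)
```

In this toy setup the selected pair is the one whose answer splits the current belief most evenly, since the noise term is the same for every query. A single wrong answer only down-weights the inconsistent tasks rather than eliminating them, which is one simple way to tolerate responder errors.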
Related papers
- Meta-reinforcement Learning With Universal Policy Adaptation: Provable Near-optimality Under All-task Optimum Comparator (2024) · 0.00
- Context Meta-reinforcement Learning Via Neuromodulation (2021) · 6.34
- Boosting Hierarchical Reinforcement Learning With Meta-learning For Complex Task Adaptation (2024) · 0.00
- Hypothesis Network Planned Exploration For Rapid Meta-reinforcement Learning Adaptation (2023) · 0.00
- Distributionally Adaptive Meta Reinforcement Learning (2022) · 2.26
- Memory Sequence Length Of Data Sampling Impacts The Adaptation Of Meta-reinforcement Learning Agents (2024) · 2.26
- Model-based Adversarial Meta-reinforcement Learning (2020) · 0.00
- A Tutorial On Meta-reinforcement Learning (2023) · 10.85