Curriculum Offline Imitation Learning
2021 · Minghuan Liu, Hanye Zhao, Zhengyu Yang, et al.
Abstract
Offline reinforcement learning (RL) tasks require the agent to learn from a pre-collected dataset with no further interaction with the environment. Despite their potential to surpass the behavioral policies, RL-based methods are generally impractical because of training instability and the bootstrapping of extrapolation errors, which require careful hyperparameter tuning via online evaluation. In contrast, offline imitation learning (IL) has no such issues, since it learns the policy directly without estimating the value function by bootstrapping. However, IL is usually limited by the capability of the behavioral policy and tends to learn mediocre behavior from a dataset collected by a mixture of policies. In this paper, we aim to take advantage of IL while mitigating this drawback. Observing that behavior cloning is able to imitate neighboring policies with less data, we propose Curriculum Offline Imitation Learning (COIL), which utilizes an experience picking strateg
Related papers
- A Policy-guided Imitation Approach For Offline Reinforcement Learning (2022)
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? (2022)
- Efficient Offline Reinforcement Learning: First Imitate, Then Improve (2024)
- Using Offline Data To Speed Up Reinforcement Learning In Procedurally Generated Environments (2023)
- Know Your Boundaries: The Necessity Of Explicit Behavioral Cloning In Offline RL (2022)
- Bridging Offline Reinforcement Learning And Imitation Learning: A Tale Of Pessimism (2021)
- Mitigating Covariate Shift In Imitation Learning Via Offline Data Without Great Coverage (2021)
- Dual RL: Unification And New Methods For Reinforcement And Imitation Learning (2023)