Is Behavior Cloning All You Need? Understanding Horizon In Imitation Learning
2024 · Dylan J. Foster, Adam Block, Dipendra Misra
Abstract
Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision-making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The simplest approach to IL, behavior cloning (BC), is thought to incur sample complexity with unfavorable quadratic dependence on the problem horizon, motivating a variety of different online algorithms that attain improved linear horizon dependence under stronger assumptions on the data and the learner's access to the expert. We revisit the apparent gap between offline and online IL from a learning-theoretic perspective, with a focus on the realizable/well-specified setting with general policy classes up to and including deep neural networks. Through a new analysis of behavior cloning with the logarithmic loss, we show that it is possible to achieve horizon-independent sample complexity in offline IL whenever (i) the range of the cumulative payoffs is con
Related papers
- Interactive And Hybrid Imitation Learning: Provably Beating Behavior Cloning (2024)
- Know Your Boundaries: The Necessity Of Explicit Behavioral Cloning In Offline RL (2022)
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? (2022)
- Swarm Behavior Cloning (2024)
- Offline Imitation Learning By Controlling The Effective Planning Horizon (2024)
- Curriculum Offline Imitation Learning (2021)
- Minimax Optimal Online Imitation Learning Via Replay Estimation (2022)
- Explaining Fast Improvement In Online Imitation Learning (2020)