Feedback In Imitation Learning: The Three Regimes Of Covariate Shift
2021 · Jonathan Spencer, Sanjiban Choudhury, Arun Venkatraman, et al.
Abstract
Imitation learning practitioners have often noted that conditioning policies on previous actions leads to a dramatic divergence between "held out" error and performance of the learner in situ. Interactive approaches can provably address this divergence but require repeated querying of a demonstrator. Recent work identifies this divergence as stemming from a "causal confound" in predicting the current action, and seeks to ablate causal aspects of current state using tools from causal inference. In this work, we argue instead that this divergence is simply another manifestation of covariate shift, exacerbated particularly by settings of feedback between decisions and input features. The learner often comes to rely on features that are strongly predictive of decisions, but are subject to strong covariate shift. Our work demonstrates a broad class of problems where this shift can be mitigated, both theoretically and practically, by taking advantage of a simulator but without any further q
Related papers
- Causal Imitation Learning Under Measurement Error And Distribution Shift (2026)
- Causal Imitation Learning With Unobserved Confounders (2022)
- Causal Confusion In Imitation Learning (2019)
- Causal Imitation Learning Under Temporally Correlated Noise (2022)
- Confounded Causal Imitation Learning With Instrumental Variables (2025)
- Causal Transfer For Imitation Learning And Decision Making Under Sensor-shift (2020)
- Mitigating Covariate Shift In Imitation Learning Via Offline Data Without Great Coverage (2021)
- Generalization Across Observation Shifts In Reinforcement Learning (2023)