Towards Principled Representation Learning From Videos For Reinforcement Learning
2024 · Dipendra Misra, Akanksha Saran, Tengyang Xie, et al.
Abstract
We study pre-training representations for decision-making using video data, which is abundantly available for tasks such as game agents and software testing. Although significant empirical advances have been made on this problem, a theoretical understanding remains absent. We initiate the theoretical investigation into principled approaches for representation learning and focus on learning the latent state representations of the underlying MDP using video data. We study two settings: one where the observations contain only iid noise, and a more challenging one where exogenous noise is also present, that is, non-iid, temporally correlated noise such as the motion of people or cars in the background. We study three commonly used approaches: autoencoding, temporal contrastive learning, and forward modeling. We prove upper bounds for temporal contrastive learning and forward modeling in the presence of only iid noise. We show that these approaches can le
Related papers
- Value-consistent Representation Learning For Data-efficient Reinforcement Learning (2022)
- Data-efficient Reinforcement Learning With Self-predictive Representations (2020)
- Learning To Identify Critical States For Reinforcement Learning From Videos (2023)
- Continual State Representation Learning For Reinforcement Learning Using Generative Replay (2018)
- Visual Processing In Context Of Reinforcement Learning (2022)
- Masked Autoencoding For Scalable And Generalizable Decision Making (2022)
- Accelerating Representation Learning With View-consistent Dynamics In Data-efficient Reinforcement Learning (2022)
- Offline Action-free Learning Of Ex-BMDPs By Comparing Diverse Datasets (2025)