Offline Action-free Learning Of Ex-BMDPs By Comparing Diverse Datasets
2025 · Alexander Levine, Peter Stone, Amy Zhang
Abstract
While sequential decision-making environments often involve high-dimensional observations, not all features of these observations are relevant for control. In particular, the observation space may capture factors of the environment which are not controllable by the agent, but which add complexity to the observation space. The need to ignore these "noise" features in order to operate in a tractably small state space poses a challenge for efficient policy learning. Due to the abundance of video data available in many such environments, task-independent representation learning from action-free offline data offers an attractive solution. However, recent work has highlighted theoretical limitations in action-free learning under the Exogenous Block MDP (Ex-BMDP) model, where temporally-correlated noise features are present in the observations. To address these limitations, we identify a realistic setting where representation learning in Ex-BMDPs becomes tractable: when action-free video data
Related papers
- Learning A Fast Mixing Exogenous Block MDP Using A Single Trajectory (2024) 0.00
- Provable RL With Exogenous Distractors Via Multistep Inverse Dynamics (2021) 0.00
- Sample-efficient Reinforcement Learning In The Presence Of Exogenous Information (2022) 0.00
- Exploiting Action Impact Regularity And Exogenous State Variables For Offline Reinforcement Learning (2021) 0.00
- Asymptotically Optimal Reinforcement Learning In Block Markov Decision Processes (2025) 0.00
- Multistep Inverse Is Not All You Need (2024) 1.20
- Towards Principled Representation Learning From Videos For Reinforcement Learning (2024) 0.00
- Debiased Offline Representation Learning For Fast Online Adaptation In Non-stationary Dynamics (2024) 0.00