Learning A Fast Mixing Exogenous Block MDP Using A Single Trajectory
2024 · Alexander Levine, Peter Stone, Amy Zhang
Abstract
To train agents that can quickly adapt to new objectives or reward functions, efficient unsupervised representation learning in sequential decision-making environments can be important. Frameworks such as the Exogenous Block Markov Decision Process (Ex-BMDP) have been proposed to formalize this representation-learning problem (Efroni et al., 2022b). In the Ex-BMDP framework, the agent's high-dimensional observations of the environment have two latent factors: a controllable factor, which evolves deterministically within a small state space according to the agent's actions, and an exogenous factor, which represents time-correlated noise and can be highly complex. The goal of the representation-learning problem is to learn an encoder that maps observations into the controllable latent space, as well as the dynamics of this space. Efroni et al. (2022b) have shown that this is possible with a sample complexity that depends only on the size of the controllable latent space, an
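The Ex-BMDP structure described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' algorithm and learns nothing; it only generates a single, reset-free trajectory from a small environment with a deterministic controllable factor and a time-correlated exogenous factor. All names and sizes (`N_STATES`, `N_EXO`, `rollout`, `oracle_encoder`, the cyclic dynamics, the lazy random walk) are hypothetical choices made for illustration.

```python
import random

# Hypothetical sizes for a toy Ex-BMDP (assumptions, not from the paper).
N_STATES = 4    # controllable latent states
N_ACTIONS = 2   # actions available to the agent
N_EXO = 10      # exogenous noise states


def rollout(num_steps, seed=0):
    """Generate one trajectory with no resets.

    Returns a list of (observation, action, true_latent_state) tuples and
    the fixed emission table used to produce observations.
    """
    rng = random.Random(seed)

    # Emission table: each (s, e) pair gets a unique observation id, so every
    # observation is generated by exactly one controllable state.
    perm = list(range(N_STATES * N_EXO))
    rng.shuffle(perm)

    s, e = 0, 0
    trajectory = []
    for _ in range(num_steps):
        a = rng.randrange(N_ACTIONS)      # uniformly random behavior policy
        obs = perm[s * N_EXO + e]         # observation depends on both factors
        trajectory.append((obs, a, s))

        # Controllable factor: deterministic, driven only by the action.
        s = (s + a) % N_STATES
        # Exogenous factor: lazy random walk on a ring, independent of actions.
        e = (e + rng.choice([-1, 0, 1])) % N_EXO
    return trajectory, perm


def oracle_encoder(obs, perm):
    """Ground-truth encoder: recover the controllable state from an observation."""
    return perm.index(obs) // N_EXO


trajectory, perm = rollout(200)
```

A learned encoder would need to reproduce `oracle_encoder` (up to a relabeling of states) from the observation and action sequence alone, without access to `perm` or the true latent states.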
Related papers
- Offline Action-free Learning Of Ex-BMDPs By Comparing Diverse Datasets (2025)
- Provable RL With Exogenous Distractors Via Multistep Inverse Dynamics (2021)
- Sample-efficient Reinforcement Learning In The Presence Of Exogenous Information (2022)
- Learning Mixtures Of Markov Chains And MDPs (2022)
- Asymptotically Optimal Reinforcement Learning In Block Markov Decision Processes (2025)
- Learning MDPs From Features: Predict-then-optimize For Sequential Decision Problems By Reinforcement Learning (2021)
- DeepMDP: Learning Continuous Latent Space Models For Representation Learning (2019)
- An Intrinsically-motivated Approach For Learning Highly Exploring And Fast Mixing Policies (2019)