Sample-efficient Reinforcement Learning In The Presence Of Exogenous Information
2022 · Yonathan Efroni, Dylan J. Foster, Dipendra Misra, et al.
Abstract
In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand. Learning from high-dimensional observations has been the subject of extensive investigation in supervised learning and statistics (e.g., via sparsity), but analogous issues in reinforcement learning are not well understood, even in finite state/action (tabular) domains. We introduce a new problem setting for reinforcement learning, the Exogenous Markov Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable (or, endogenous) component and a large irrelevant (or, exogenous) component; the exogenous component is independent of the learner's actions, but evolves in an arbitrary, temporally correlated fashion. We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity polynomial in the size of the endogenous component an
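The factorization described in the abstract can be illustrated with a toy sketch. It is not the ExoRL algorithm and is not taken from the paper: the names `ENDO`, `D`, `step`, and `reward`, the binary factors, and the 0.8 persistence probability are all hypothetical choices made only for illustration. The sketch assumes a state made of `D` binary factors, of which a small hidden subset is endogenous (affected by actions) while the remaining factors evolve on their own, correlated over time but independent of the action.

```python
import random

# Hypothetical toy ExoMDP: D binary factors, two of them endogenous.
# The learner would not be told which indices are in ENDO.
D = 5
ENDO = (0, 2)

def step(state, action, rng):
    """Return the next state under a factored transition."""
    nxt = list(state)
    # Endogenous part: the action picks which endogenous factor to flip.
    k = ENDO[action % len(ENDO)]
    nxt[k] = 1 - state[k]
    # Exogenous part: each factor persists with probability 0.8, else is
    # resampled. This depends on the previous exogenous value (temporal
    # correlation) but never on the action.
    for i in range(D):
        if i not in ENDO:
            nxt[i] = state[i] if rng.random() < 0.8 else rng.randint(0, 1)
    return tuple(nxt)

def reward(state, action):
    """Reward depends only on the endogenous factors (an assumption here)."""
    return float(all(state[i] == 1 for i in ENDO))
```

With the same random seed, two different actions produce identical exogenous coordinates in the next state, which is the action-independence property the setting relies on.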
Related papers
- Provable RL With Exogenous Distractors Via Multistep Inverse Dynamics (2021)
- Discovering And Removing Exogenous State Variables And Rewards For Reinforcement Learning (2018)
- Offline Action-free Learning Of Ex-BMDPs By Comparing Diverse Datasets (2025)
- Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity (2021)
- Gap-dependent Unsupervised Exploration For Reinforcement Learning (2021)
- Sample-efficient Reinforcement Learning For Linearly-parameterized MDPs With A Generative Model (2021)
- Sample And Oracle Efficient Reinforcement Learning For MDPs With Linearly-realizable Value Functions (2024)
- Provably Efficient Exploration For Reinforcement Learning Using Unsupervised Learning (2020)