Provably Efficient Reinforcement Learning In Partially Observable Dynamical Systems
2022 · Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, et al.
Abstract
We study Reinforcement Learning for partially observable dynamical systems using function approximation. We propose a new Partially Observable Bilinear Actor-Critic framework that is general enough to include models such as observable tabular Partially Observable Markov Decision Processes (POMDPs), observable Linear-Quadratic-Gaussian (LQG), Predictive State Representations (PSRs), as well as a newly introduced model, Hilbert Space Embeddings of POMDPs, and observable POMDPs with latent low-rank transition. Under this framework, we propose an actor-critic style algorithm that is capable of performing agnostic policy learning. Given a policy class that consists of memory-based policies (that look at a fixed-length window of recent observations), and a value function class that consists of functions taking both memory and future observations as inputs, our algorithm learns to compete against the best memory-based policy in the given policy class. For certain examples such as un
Related papers
- Reinforcement Learning From Partial Observation: Linear Function Approximation With Provable Sample Efficiency (2022)
- Provable Representation With Efficient Planning For Partial Observable Reinforcement Learning (2023)
- Multi-agent Off-policy Actor-critic Reinforcement Learning For Partially Observable Environments (2024)
- Computationally Efficient PAC RL In POMDPs With Latent Determinism And Conditional Embeddings (2022)
- Robust Reinforcement Learning In POMDPs With Incomplete And Noisy Observations (2019)
- Pessimism In The Face Of Confounders: Provably Efficient Offline Reinforcement Learning In Partially Observable Markov Decision Processes (2022)
- Actor-critic Policy Optimization In Partially Observable Multiagent Environments (2018)
- Unbiased Asymmetric Reinforcement Learning Under Partial Observability (2021)