Learning Causal States Under Partial Observability And Perturbation
2025 · Na Li, Hangguan Shan, Wei Ni, et al.
Abstract
A critical challenge for reinforcement learning (RL) is making decisions based on incomplete and noisy observations, especially in perturbed and partially observable Markov decision processes (P²OMDPs). Existing methods fail to mitigate perturbations while addressing partial observability. We propose Causal State Representation under Asynchronous Diffusion Model (CaDiff), a framework that enhances any RL algorithm by uncovering the underlying causal structure of P²OMDPs. This is achieved by incorporating a novel asynchronous diffusion model (ADM) and a new bisimulation metric. The ADM allows the forward and reverse processes to use different numbers of steps, so that the perturbation of a P²OMDP is interpreted as part of the noise suppressed through diffusion. The bisimulation metric quantifies the similarity between partially observable environments and their causal counterparts. Moreover, we establish the theoretical guarantee of CaDiff by deriving an upper bound for the va
Related papers
- Learning Causal State Representations Of Partially Observable Environments (2019) 0.00
- A Relative Ignorability Framework For Decision-relevant Observability In Control Theory And Reinforcement Learning (2025) 0.00
- Quantifying First-order Markov Violations In Noisy Reinforcement Learning: A Causal Discovery Approach (2025) 0.00
- Causal Reinforcement Learning Using Observational And Interventional Data (2021) 0.00
- Pessimism In The Face Of Confounders: Provably Efficient Offline Reinforcement Learning In Partially Observable Markov Decision Processes (2022) 0.00
- Reinforcement Learning Under Partial Observability Guided By Learned Environment Models (2022) 6.34
- Active Inference And Reinforcement Learning: A Unified Inference On Continuous State And Action Spaces Under Partial Observability (2022) 5.84
- Provable Partially Observable Reinforcement Learning With Privileged Information (2024) 2.26