Causal Deep Reinforcement Learning Using Observational Data
2022 · Wenxuan Zhu, Chao Yu, Qiang Zhang
Abstract
Deep reinforcement learning (DRL) requires the collection of interventional data, which is sometimes expensive and even unethical in the real world, for example in autonomous driving and in the medical field. Offline reinforcement learning promises to alleviate this issue by exploiting the vast amount of observational data available in the real world. However, observational data may lead the learning agent to undesirable outcomes if the behavior policy that generates the data depends on unobserved random variables (i.e., confounders). In this paper, we propose two deconfounding methods in DRL to address this problem. The methods first calculate the importance degree of each sample using causal inference techniques, and then adjust the impact of each sample on the loss function by reweighting or resampling the offline dataset to ensure its unbiasedness. These deconfounding methods can be flexibly combined with existing model-free DRL algorithms such as soft actor-criti
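The two adjustments named in the abstract, reweighting and resampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the importance degree is computed, so the sketch assumes an inverse-propensity weight 1 / p(a | s) as a stand-in, and every function name below is hypothetical.

```python
import numpy as np


def inverse_propensity_weights(propensities, clip=1e-3):
    """Assumed stand-in for the importance degree: 1 / p(a | s), scaled to mean 1."""
    p = np.clip(np.asarray(propensities, dtype=float), clip, 1.0)
    w = 1.0 / p
    return w / w.mean()


def reweighted_loss(per_sample_losses, weights):
    """Reweighting: each sample's loss counts in proportion to its weight."""
    losses = np.asarray(per_sample_losses, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * losses) / np.sum(w))


def resample_indices(weights, size, rng):
    """Resampling: draw dataset indices with probability proportional to weight."""
    w = np.asarray(weights, dtype=float)
    return rng.choice(len(w), size=size, replace=True, p=w / w.sum())


# Toy offline batch: three transitions with assumed behavior propensities.
weights = inverse_propensity_weights([0.5, 0.25, 0.25])
loss = reweighted_loss([1.0, 2.0, 3.0], weights)
batch = resample_indices(weights, size=1000, rng=np.random.default_rng(0))
```

In both variants the underlying DRL loss is unchanged. Reweighting scales each sample's term in the loss, while resampling changes how often each sample is drawn and then applies the ordinary unweighted loss to the resampled batch.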
Related papers
- Provably Efficient Causal Reinforcement Learning With Confounded Observational Data (2020)
- Causal Reinforcement Learning Using Observational And Interventional Data (2021)
- Data Valuation For Offline Reinforcement Learning (2022)
- Pessimistic Causal Reinforcement Learning With Mediators For Confounded Offline Data (2024)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Overcoming Model Bias For Robust Offline Deep Reinforcement Learning (2020)
- Reccover: Detecting Causal Confusion For Explainable Reinforcement Learning (2022)
- D4RL: Datasets For Deep Data-driven Reinforcement Learning (2020)