Sample-efficient Reinforcement Learning Via Counterfactual-based Data Augmentation
2020 · Chaochao Lu, Biwei Huang, Ke Wang, et al.
Abstract
Reinforcement learning (RL) algorithms usually require a substantial amount of interaction data and perform well only for specific tasks in a fixed environment. In some scenarios such as healthcare, however, only a few records are usually available for each patient, and patients may respond differently to the same treatment, which impedes the application of current RL algorithms to learn optimal policies. To address the issues of mechanism heterogeneity and the related data scarcity, we propose a data-efficient RL algorithm that exploits structural causal models (SCMs) to model the state dynamics, which are estimated by leveraging both commonalities and differences across subjects. The learned SCM enables us to reason counterfactually about what would have happened had another treatment been taken. It helps avoid real (possibly risky) exploration and mitigates the issue that limited experiences lead to biased policies. We propose counterfactual RL algorithms to learn both population-level and indi
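The counterfactual step described in the abstract can be illustrated with a minimal sketch. It assumes a toy additive-noise SCM, s' = theta * s + a + u, where theta is a hypothetical subject-specific parameter; this is not the authors' estimator, and the function names (mechanism, abduct_noise, augment) are invented for illustration. The sketch follows the standard abduction, action, prediction recipe: recover the noise from an observed transition, swap in another action, and recompute the next state.

```python
# Toy additive-noise SCM (an assumption, not the paper's model):
#     s_next = theta * s + a + u
# theta: hypothetical subject-specific parameter; u: unobserved noise.

def mechanism(s, a, theta):
    """Deterministic part of the assumed transition mechanism."""
    return theta * s + a

def abduct_noise(s, a, s_next, theta):
    """Abduction: recover the noise consistent with the observed transition."""
    return s_next - mechanism(s, a, theta)

def counterfactual_next_state(s, a, s_next, a_cf, theta):
    """Action + prediction: replay the same noise under a different action."""
    u = abduct_noise(s, a, s_next, theta)
    return mechanism(s, a_cf, theta) + u

def augment(transitions, actions, theta):
    """Add one counterfactual transition per unobserved action."""
    augmented = list(transitions)
    for s, a, s_next in transitions:
        for a_cf in actions:
            if a_cf == a:
                continue  # the factual action is already in the data
            s_cf = counterfactual_next_state(s, a, s_next, a_cf, theta)
            augmented.append((s, a_cf, s_cf))
    return augmented

# One observed record for one subject, two candidate treatments.
observed = [(1.0, 0.0, 0.7)]
data = augment(observed, actions=[0.0, 1.0], theta=0.5)
```

In this toy setting the augmented set contains the factual transition plus one counterfactual transition for the untaken action, which a downstream RL algorithm could consume without further real exploration. In practice the mechanism and theta would have to be estimated from data rather than assumed.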
Related papers
- Counterfactually Fair Reinforcement Learning Via Sequential Data Preprocessing (2025) 0.00
- Learning Impartial Policies For Sequential Counterfactual Explanations Using Deep Reinforcement Learning (2023) 0.00
- MoCoDA: Model-based Counterfactual Data Augmentation (2022) 2.26
- Counterfactual Experience Augmented Off-policy Reinforcement Learning (2025) 0.00
- ACTER: Diverse And Actionable Counterfactual Sequences For Explaining And Diagnosing RL Policies (2024) 0.00
- RACCER: Towards Reachable And Certain Counterfactual Explanations For Reinforcement Learning (2023) 0.00
- Causal Deep Reinforcement Learning Using Observational Data (2022) 5.84
- Provably Efficient Causal Reinforcement Learning With Confounded Observational Data (2020) 0.00