Interpretable Off-policy Evaluation In Reinforcement Learning By Highlighting Influential Transitions
2020 · Omer Gottesman, Joseph Futoma, Yao Liu, et al.
Abstract
Off-policy evaluation (OPE) in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education, but safe deployment in high-stakes settings requires ways of assessing its validity. Traditional measures such as confidence intervals may be insufficient due to noise, limited data, and confounding. In this paper we develop a method that could serve as a hybrid human-AI system, enabling human experts to analyze the validity of policy evaluation estimates. This is accomplished by highlighting observations in the data whose removal would have a large effect on the OPE estimate, and by formulating a set of rules for choosing which ones to present to domain experts for validation. We develop methods to compute exactly the influence functions for fitted Q-evaluation with two different function classes, kernel-based and linear least squares, as well as for importance sampling methods. Experiments on medical simulations and real-world i
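The idea of flagging observations whose removal strongly changes the OPE estimate can be illustrated with a minimal sketch. It assumes an ordinary (trajectory-level) importance sampling estimator and a brute-force leave-one-out computation; the function names, the toy data, and the fixed `threshold` rule are all hypothetical and are not taken from the paper, whose influence functions and selection rules may differ.

```python
from typing import List, Sequence


def is_estimate(weights: Sequence[float], returns: Sequence[float]) -> float:
    """Ordinary importance sampling estimate: mean of weight * return."""
    if len(weights) != len(returns) or len(weights) == 0:
        raise ValueError("weights and returns must be non-empty and equal length")
    return sum(w * g for w, g in zip(weights, returns)) / len(weights)


def removal_influences(weights: Sequence[float], returns: Sequence[float]) -> List[float]:
    """Change in the estimate when each trajectory is removed in turn."""
    n = len(weights)
    if n < 2 or n != len(returns):
        raise ValueError("need at least two trajectories of matching length")
    total = sum(w * g for w, g in zip(weights, returns))
    full = total / n
    # Estimate without trajectory i, minus the estimate on all data.
    return [(total - w * g) / (n - 1) - full for w, g in zip(weights, returns)]


def flag_influential(
    weights: Sequence[float], returns: Sequence[float], threshold: float
) -> List[int]:
    """Indices whose removal shifts the estimate by more than `threshold`.

    The fixed threshold is an illustrative assumption, not the paper's rule.
    """
    influences = removal_influences(weights, returns)
    return [i for i, d in enumerate(influences) if abs(d) > threshold]


if __name__ == "__main__":
    # Toy data (made up): one trajectory has a large weight and return.
    w = [1.0, 1.0, 4.0]
    g = [1.0, 2.0, 10.0]
    print(is_estimate(w, g))
    print(removal_influences(w, g))
    print(flag_influential(w, g, threshold=10.0))
```

In this toy setting the flagged indices are the candidates one would hand to a domain expert for review; with real data the weights would come from the ratio of evaluation-policy to behavior-policy action probabilities.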
Related papers
- Counterfactual-augmented Importance Sampling For Semi-offline Policy Evaluation (2023)
- Off-policy Evaluation In Infinite-horizon Reinforcement Learning With Latent Confounders (2020)
- Intrinsically Efficient, Stable, And Bounded Off-policy Evaluation For Reinforcement Learning (2019)
- Empirical Study Of Off-policy Policy Evaluation For Reinforcement Learning (2019)
- Towards Optimal Off-policy Evaluation For Reinforcement Learning With Marginalized Importance Sampling (2019)
- Conformal Off-policy Evaluation In Markov Decision Processes (2023)
- An Instrumental Variable Approach To Confounded Off-policy Evaluation (2022)
- More Efficient Off-policy Evaluation Through Regularized Targeted Learning (2019)