Online Estimation And Inference For Robust Policy Evaluation In Reinforcement Learning
2023 · Weidong Liu, Jiyuan Tu, Xi Chen, et al.
Abstract
Reinforcement learning has emerged as one of the prominent topics attracting attention in modern statistical learning, with policy evaluation being a key component. Unlike the traditional machine learning literature on this topic, our work emphasizes statistical inference for the model parameters and value functions of reinforcement learning algorithms. While most existing analyses assume random rewards to follow standard distributions, we embrace the concept of robust statistics in reinforcement learning by simultaneously addressing issues of outlier contamination and heavy-tailed rewards within a unified framework. In this paper, we develop a fully online robust policy evaluation procedure, and establish the Bahadur-type representation of our estimator. Furthermore, we develop an online procedure to efficiently conduct statistical inference based on the asymptotic distribution. This paper connects robust statistics and statistical inference in reinforcement learning, offering a more
Related papers
- Doubly Robust Interval Estimation For Optimal Policy Evaluation In Online Learning (2021)
- Robust On-policy Sampling For Data-efficient Policy Evaluation In Reinforcement Learning (2021)
- Online Bootstrap Inference For Policy Evaluation In Reinforcement Learning (2021)
- Doubly Robust Off-policy Value And Gradient Estimation For Deterministic Policies (2020)
- Expert-supervised Reinforcement Learning For Offline Policy Learning And Evaluation (2020)
- Robust Fitted-q-evaluation And Iteration Under Sequentially Exogenous Unobserved Confounders (2023)
- Online Robust Reinforcement Learning With Model Uncertainty (2021)
- Towards Robust Off-policy Learning For Runtime Uncertainty (2022)