Preserving Expert-level Privacy In Offline Reinforcement Learning
2024 · Navodita Sharma, Vishnu Vinod, Abhradeep Thakurta, et al.
Abstract
The offline reinforcement learning (RL) problem aims to learn an optimal policy from historical data collected by one or more behavioural policies (experts) by interacting with an environment. However, the individual experts may be privacy-sensitive in that the learnt policy may retain information about their precise choices. In some domains like personalized retrieval, advertising and healthcare, the expert choices are considered sensitive data. To provably protect the privacy of such experts, we propose a novel consensus-based expert-level differentially private offline RL training approach compatible with any existing offline RL algorithm. We prove rigorous differential privacy guarantees, while maintaining strong empirical performance. Unlike existing work in differentially private RL, we supplement the theory with proof-of-concept experiments on classic RL environments featuring large continuous state spaces, demonstrating substantial improvements over a natural baseline across mu