Local Differential Privacy For Regret Minimization In Reinforcement Learning
2020 · Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, et al.
Abstract
Reinforcement learning algorithms are widely used in domains where it is desirable to provide a personalized service. In these domains it is common that user data contains sensitive information that needs to be protected from third parties. Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user side. We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework. We establish a lower bound for regret minimization in finite-horizon MDPs with LDP guarantees which shows that guaranteeing privacy has a multiplicative effect on the regret. This result shows that while LDP is an appealing notion of privacy, it makes the learning problem significantly more complex. Finally, we present an optimistic algorithm that simultaneously satisfies \(\epsilon\)-LDP requirements, and achieves \(\sqrt{K}/\epsilon\) regret in any finite-horizon MDP after \(K\) epis
Related papers
- Near-optimal Differentially Private Reinforcement Learning (2022)
- Offline Reinforcement Learning With Differential Privacy (2022)
- Efficient Differentially Private Fine-tuning Of LLMs Via Reinforcement Learning (2025)
- Locally Private Distributed Reinforcement Learning (2020)
- Refined Regret For Adversarial MDPs With Linear Function Approximation (2023)
- Privacy-preserving Reinforcement Learning From Human Feedback Via Decoupled Reward Modeling (2026)
- Regret Bounds For Discounted MDPs (2020)
- Regret Analysis In Deterministic Reinforcement Learning (2021)