Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning
2025 · Afshin Khadangi, Amir Sartipi, Igor Tchappi, et al.
Abstract
The tension between data privacy and model utility has become the defining bottleneck for the practical deployment of large language models (LLMs) trained on sensitive corpora, including healthcare data. Differentially private stochastic gradient descent (DP-SGD) guarantees formal privacy, yet it does so at a pronounced cost: gradients are forcibly clipped and perturbed with noise, degrading sample efficiency and final accuracy. Numerous variants have been proposed to soften this trade-off, but they all share a handicap: their control knobs are hard-coded, global, and oblivious to the evolving optimization landscape. Consequently, practitioners are forced either to over-spend privacy budget in pursuit of utility, or to accept mediocre models in order to stay within privacy constraints. We present RLDP, the first framework to cast DP optimization itself as a closed-loop control problem amenable to modern deep reinforcement learning (RL). RLDP continuously senses rich statistics of the learnin
Related papers
- Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling (2026)
- Local Differential Privacy for Regret Minimization in Reinforcement Learning (2020)
- Locally Private Distributed Reinforcement Learning (2020)
- Near-Optimal Differentially Private Reinforcement Learning (2022)
- Offline Reinforcement Learning with Differential Privacy (2022)
- KL-Regularization Itself Is Differentially Private in Bandits and RLHF (2025)
- SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models (2025)
- Differentially Private Policy Evaluation (2016)