POPri: Private Federated Learning Using Preference-Optimized Synthetic Data
2025 · Charlie Hou, Mei-Yu Wang, Yige Zhu, et al.
Abstract
In practical settings, differentially private federated learning (DP-FL) is the dominant method for training models from private, on-device client data. Recent work has suggested that DP-FL may be enhanced or outperformed by methods that use DP synthetic data (Wu et al., 2024; Hou et al., 2024). The primary algorithms for generating DP synthetic data for FL applications require careful prompt engineering based on public information and/or iterative private client feedback. Our key insight is that the private client feedback collected by prior DP synthetic data methods (Hou et al., 2024; Xie et al., 2024) can be viewed as a reinforcement learning (RL) reward. Our algorithm, Policy Optimization for Private Data (POPri), harnesses client feedback using policy optimization algorithms such as Direct Preference Optimization (DPO) to fine-tune LLMs to generate high-quality DP synthetic data. To evaluate POPri, we release LargeFedBench, a new federated text benchmark for uncontaminated LLM eva
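The idea of treating client feedback as a reward and optimizing it with DPO can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that each client returns one numeric vote per candidate synthetic sample, that the votes are summed and perturbed with Gaussian noise of scale `sigma`, and that the highest- and lowest-scoring samples form the chosen/rejected pair. The function names `noisy_scores`, `preference_pair`, and `dpo_loss` are hypothetical. Only the loss itself follows the standard DPO formula.

```python
import math
import random


def noisy_scores(client_votes, sigma, rng):
    """Sum per-client votes for each candidate and add Gaussian noise.

    Assumption: client_votes[c][i] is client c's vote for candidate i.
    Calibrating sigma to a privacy budget is out of scope for this sketch.
    """
    n = len(client_votes[0])
    totals = [sum(votes[i] for votes in client_votes) for i in range(n)]
    return [t + rng.gauss(0.0, sigma) for t in totals]


def preference_pair(samples, scores):
    """Return (chosen, rejected): the highest- and lowest-scoring samples.

    Assumption: top-versus-bottom pairing is an illustrative choice only.
    """
    order = sorted(range(len(samples)), key=lambda i: scores[i])
    return samples[order[-1]], samples[order[0]]


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log sigmoid(beta * ((logp_c - ref_c) - (logp_r - ref_r)))
    The inputs are sequence log-probabilities under the policy being
    fine-tuned and under a frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(m)) equals softplus(-m); computed in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    rng = random.Random(0)
    samples = ["synthetic text A", "synthetic text B", "synthetic text C"]
    votes = [[1, 0, 0], [1, 0, 1], [1, 0, 0]]  # hypothetical client votes
    scores = noisy_scores(votes, sigma=0.5, rng=rng)
    chosen, rejected = preference_pair(samples, scores)
    print(chosen, "|", rejected)
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
```

In a real pipeline the log-probabilities would come from the LLM being fine-tuned and a frozen copy of it, and the loss would be averaged over many pairs before each gradient step.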
Related papers
- PrivORL: Differentially Private Synthetic Dataset for Offline Reinforcement Learning (2025) · 0.00
- Optimized Local Updates in Federated Learning via Reinforcement Learning (2025) · 0.00
- Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling (2026) · 0.00
- Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning (2025) · 0.00
- A Fair Federated Learning Framework with Reinforcement Learning (2022) · 0.00
- Locally Private Distributed Reinforcement Learning (2020) · 0.00
- Federated Offline Policy Optimization with Dual Regularization (2024) · 3.58
- FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation (2025) · 2.26