COPA: Certifying Robust Policies For Offline Reinforcement Learning Against Poisoning Attacks
2022 · Fan Wu, Linyi Li, Chejian Xu, et al.
Abstract
As reinforcement learning (RL) has achieved near human-level performance in a variety of tasks, its robustness has attracted considerable attention. While a vast body of research has explored test-time (evasion) attacks on RL and the corresponding defenses, the robustness of RL against training-time (poisoning) attacks remains largely unexplored. In this work, we focus on certifying the robustness of offline RL in the presence of poisoning attacks, where a subset of training trajectories could be arbitrarily manipulated. We propose the first certification framework, COPA, to certify the number of poisoned trajectories that can be tolerated under different certification criteria. Given the complex structure of RL, we propose two certification criteria: per-state action stability and cumulative reward bound. To further improve the certification, we propose new partition and aggregation protocols to train robust policies. We further prove that some of the proposed certification methods are theoretically
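The idea of per-state action stability under a partition-and-aggregate protocol can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a generic scheme in which training trajectories are split into disjoint partitions by a deterministic hash, one policy is trained per partition, and the action at a state is chosen by majority vote with ties broken toward the smaller action index. It also assumes that each poisoned trajectory (one insertion or deletion) can change the vote of at most one partition. The names `partition_index` and `aggregate_and_certify` are hypothetical.

```python
import hashlib
from collections import Counter


def partition_index(trajectory_bytes: bytes, num_partitions: int) -> int:
    """Assign a serialized trajectory to a partition via a deterministic hash.

    Assumption: the partition depends only on the trajectory's own content,
    so adding or removing one trajectory touches exactly one partition.
    """
    digest = hashlib.sha256(trajectory_bytes).hexdigest()
    return int(digest, 16) % num_partitions


def aggregate_and_certify(votes: list[int], num_actions: int) -> tuple[int, int]:
    """Majority-vote the per-partition actions at one state.

    Returns (chosen_action, tolerable_poisoning_size), where the second value
    is the number of changed votes under which the chosen action provably
    stays the same. Requires num_actions >= 2.
    """
    counts = Counter(votes)
    # Winner: highest vote count, ties broken toward the smaller action index.
    winner = min(range(num_actions), key=lambda a: (-counts[a], a))
    # Each changed vote can lower the winner's count by one and raise a rival's
    # count by one, shrinking the gap by two. A rival with a smaller index also
    # wins ties, so it needs one vote fewer to overtake the winner.
    smallest_gap = min(
        counts[winner] - counts[a] - (1 if a < winner else 0)
        for a in range(num_actions)
        if a != winner
    )
    return winner, smallest_gap // 2


if __name__ == "__main__":
    # Eight hypothetical partition policies voting over three actions.
    action, tolerated = aggregate_and_certify([0, 0, 0, 0, 0, 1, 1, 2], num_actions=3)
    print(action, tolerated)
```

In this sketch the certificate is per state: it bounds how many votes can flip before the aggregated action at that single state changes, and says nothing by itself about cumulative reward.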
Related papers
- Vulnerability-aware Poisoning Mechanism For Online RL With Unknown Dynamics (2020)
- Efficient Reward Poisoning Attacks On Online Deep Reinforcement Learning (2022)
- Policy Teaching In Reinforcement Learning Via Environment Poisoning Attacks (2020)
- Policy Resilience To Environment Poisoning Attacks On Reinforcement Learning (2023)
- Reward Poisoning Attacks On Offline Multi-agent Reinforcement Learning (2022)
- Universal Black-box Reward Poisoning Attack Against Offline Reinforcement Learning (2024)
- Online Poisoning Attack Against Reinforcement Learning Under Black-box Environments (2024)
- Understanding The Limits Of Poisoning Attacks In Episodic Reinforcement Learning (2022)