Universal Black-box Reward Poisoning Attack Against Offline Reinforcement Learning
2024 · Yinglun Xu, Rohan Gumaste, Gagandeep Singh
Abstract
We study the problem of universal black-box reward poisoning attacks against general offline reinforcement learning with deep neural networks. We consider a black-box threat model in which the attacker is entirely oblivious to the learning algorithm, and its budget is limited by constraining both the amount of corruption at each data point and the total perturbation. We require the attack to be universally efficient against any efficient algorithm that the agent might use. We propose an attack strategy called the "policy contrast attack." The idea is to find low- and high-performing policies covered by the dataset and make them appear to the agent to be high- and low-performing, respectively. To the best of our knowledge, this is the first universal black-box reward poisoning attack in the general offline RL setting. We provide theoretical insights on the attack design and empirically show that our attack is efficient against current state-of-the-art offline RL algorithms in diffe
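The budget-constrained idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the function name `poison_rewards`, the distance threshold `radius`, and the reading of "total perturbation" as an L1 sum are all assumptions made for illustration. It lowers rewards on transitions whose actions resemble a high-performing policy, raises them on transitions that resemble a low-performing one, and then rescales to respect both budget limits.

```python
import numpy as np

def poison_rewards(rewards, actions, good_actions, bad_actions,
                   per_point_cap, total_budget, radius):
    """Hypothetical reward-poisoning sketch (not the paper's method).

    rewards:      shape (n,)   logged rewards
    actions:      shape (n, d) logged actions
    good_actions: shape (n, d) actions a high-performing policy would take
    bad_actions:  shape (n, d) actions a low-performing policy would take
    """
    rewards = np.asarray(rewards, dtype=float)
    actions = np.asarray(actions, dtype=float)

    # Distance from each logged action to each reference policy's action.
    d_good = np.linalg.norm(actions - np.asarray(good_actions, dtype=float), axis=1)
    d_bad = np.linalg.norm(actions - np.asarray(bad_actions, dtype=float), axis=1)

    near_good = d_good <= radius
    near_bad = (d_bad <= radius) & ~near_good

    # Make the good policy look bad and the bad policy look good.
    delta = np.zeros_like(rewards)
    delta[near_good] = -per_point_cap
    delta[near_bad] = per_point_cap

    # Assumption: the total budget is an L1 bound on the perturbation.
    # Scaling down can only shrink each entry, so the per-point cap still holds.
    total = np.abs(delta).sum()
    if total > total_budget:
        delta *= total_budget / total

    return rewards + delta


# Toy usage with one-dimensional actions.
rewards = np.array([1.0, 0.0, 0.3])
actions = np.array([[0.0], [1.0], [0.5]])
good = np.zeros((3, 1))   # hypothetical high-performing policy's actions
bad = np.ones((3, 1))     # hypothetical low-performing policy's actions
poisoned = poison_rewards(rewards, actions, good, bad,
                          per_point_cap=1.0, total_budget=1.0, radius=0.1)
```

Uniform rescaling is the simplest way to satisfy both constraints at once; an attacker could instead spend the budget on a chosen subset of transitions, which this sketch does not attempt.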
Related papers
- Black-box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning (2023)
- Reward Poisoning In Reinforcement Learning: Attacks Against Unknown Learners In Unknown Environments (2021)
- Efficient Reward Poisoning Attacks On Online Deep Reinforcement Learning (2022)
- Online Poisoning Attack Against Reinforcement Learning Under Black-box Environments (2024)
- Reward Poisoning Attacks On Offline Multi-agent Reinforcement Learning (2022)
- Policy Teaching In Reinforcement Learning Via Environment Poisoning Attacks (2020)
- Vulnerability-aware Poisoning Mechanism For Online RL With Unknown Dynamics (2020)
- SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents (2024)