Poisoning Deep Reinforcement Learning Agents With In-distribution Triggers
2021 · Chace Ashcraft, Kiran Karra
Abstract
In this paper, we propose a new data poisoning attack and apply it to deep reinforcement learning agents. Our attack centers on what we call in-distribution triggers, which are triggers native to the data distributions the model will be trained on and deployed in. We outline a simple procedure for embedding these and other triggers in deep reinforcement learning agents following a multi-task learning paradigm, and demonstrate it in three common reinforcement learning environments. We believe that this work has important implications for the security of deep learning models.
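To make the idea concrete, the following is a minimal sketch of how a poisoned training signal built around an in-distribution trigger might look. It is not taken from the paper: the trigger condition, the reward rule, the per-episode task sampling, and all names (`trigger_present`, `poisoned_reward`, `sample_task`, `training_reward`) are hypothetical and chosen only for illustration. The trigger here is simply a state the environment can reach on its own, rather than an artificial pattern stamped onto the observation.

```python
import random

def trigger_present(observation):
    # ASSUMPTION: the trigger is an ordinary, reachable state
    # (here, the integer observation 3), not an injected pattern.
    return observation == 3

def poisoned_reward(observation, action, reward, target_action=0):
    # ASSUMPTION: when the trigger is visible, reward the attacker's
    # target action and penalise everything else; otherwise pass the
    # environment's reward through unchanged.
    if trigger_present(observation):
        return 1.0 if action == target_action else -1.0
    return reward

def sample_task(poison_fraction, rng):
    # Multi-task style mixing: each episode is drawn as either the
    # clean task or the poisoned task.
    return "poisoned" if rng.random() < poison_fraction else "clean"

def training_reward(task, observation, action, reward):
    # Only episodes drawn as the poisoned task see the altered reward.
    if task == "poisoned":
        return poisoned_reward(observation, action, reward)
    return reward

if __name__ == "__main__":
    rng = random.Random(0)
    task = sample_task(poison_fraction=0.2, rng=rng)
    print(task, training_reward(task, observation=3, action=0, reward=0.5))
```

Under these assumptions, an agent trained on the mixture would see unmodified rewards in clean episodes and a different objective whenever the trigger state appears in a poisoned episode, which is the sense in which the two behaviours are learned as separate tasks.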
Related papers
- Execute Order 66: Targeted Data Poisoning For Reinforcement Learning (2022)
- TrojDRL: Trojan Attacks On Deep Reinforcement Learning Agents (2019)
- Efficient Reward Poisoning Attacks On Online Deep Reinforcement Learning (2022)
- Black-box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning (2023)
- Online Poisoning Attack Against Reinforcement Learning Under Black-box Environments (2024)
- Adversarial Inception Backdoor Attacks Against Reinforcement Learning (2024)
- Beyond Training-time Poisoning: Component-level And Post-training Backdoors In Deep Reinforcement Learning (2025)
- Policy Teaching In Reinforcement Learning Via Environment Poisoning Attacks (2020)