Minimalistic Attacks: How Little It Takes To Fool A Deep Reinforcement Learning Policy
2019 · Xinghua Qu, Zhu Sun, Yew-Soon Ong, et al.
Abstract
Recent studies have revealed that neural network-based policies can be easily fooled by adversarial examples. However, while most prior works analyze the effects of perturbing every pixel of every frame assuming white-box policy access, in this paper we take a more restrictive view towards adversary generation - with the goal of unveiling the limits of a model's vulnerability. In particular, we explore minimalistic attacks by defining three key settings: (1) black-box policy access: where the attacker only has access to the input (state) and output (action probability) of an RL policy; (2) fractional-state adversary: where only several pixels are perturbed, with the extreme case being a single-pixel adversary; and (3) tactically-chanced attack: where only significant frames are tactically chosen to be attacked. We formulate the adversarial attack by accommodating the three key settings and explore their potency on six Atari games by examining four fully trained state-of-the-art policie
Related papers
- Adversarial Policies: Attacking Deep Reinforcement Learning (2019)
- Real-time Adversarial Perturbations Against Deep Reinforcement Learning Policies: Attacks And Defenses (2021)
- Investigating Vulnerabilities Of Deep Neural Policies (2021)
- Red Teaming With Mind Reading: White-box Adversarial Policies Against RL Agents (2022)
- Copycat: Taking Control Of Neural Policies With Constant Attacks (2019)
- Attacking And Defending Deep Reinforcement Learning Policies (2022)
- Robust Deep Reinforcement Learning Against Adversarial Behavior Manipulation (2024)
- Targeted Adversarial Attacks On Deep Reinforcement Learning Policies Via Model Checking (2022)