Adversarial Policies: Attacking Deep Reinforcement Learning
2019 · Adam Gleave, Michael Dennis, Cody Wild, et al.
Abstract
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent's observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce substantially different activations in the victim policy network than when the victim plays against a norm
Related papers
- Observed Adversaries In Deep Reinforcement Learning (2022)
- Real-time Adversarial Perturbations Against Deep Reinforcement Learning Policies: Attacks And Defenses (2021)
- Neutral Agent-based Adversarial Policy Learning Against Deep Reinforcement Learning In Multi-party Open Systems (2025)
- Targeted Adversarial Attacks On Deep Reinforcement Learning Policies Via Model Checking (2022)
- Query-based Targeted Action-space Adversarial Policies On Deep Reinforcement Learning Agents (2020)
- Online Robust Policy Learning In The Presence Of Unknown Adversaries (2018)
- Regret-based Defense In Adversarial Reinforcement Learning (2023)
- Learning To Cope With Adversarial Attacks (2019)