Detecting Adversarial Directions In Deep Reinforcement Learning To Make Robust Decisions
2023 · Ezgi Korkmaz, Jonah Brown-Cohen
Abstract
Learning in MDPs with highly complex state representations is currently possible due to multiple advancements in reinforcement learning algorithm design. However, this increase in complexity, together with the increase in the dimensionality of the observations, came at the cost of volatility that can be exploited via adversarial attacks (i.e., moving along worst-case directions in the observation space). To solve this policy instability problem, we propose a novel method to detect the presence of these non-robust directions via a local quadratic approximation of the deep neural policy loss. Our method provides a theoretical basis for the fundamental cut-off between safe observations and adversarial observations. Furthermore, our technique is computationally efficient and does not depend on the methods used to produce the worst-case directions. We conduct extensive experiments in the Arcade Learning Environment with several different adversarial attack techniques. Most significantly, w
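To make the phrase "local quadratic approximation" concrete, here is a minimal sketch of a second-order Taylor approximation of a scalar loss along one direction in observation space. It is not the authors' detection method, whose exact statistic is not given in this excerpt. The names `toy_loss`, `slope_and_curvature`, and `quadratic_approx` are hypothetical, the loss is a toy stand-in for a deep neural policy loss, and finite differences stand in for whatever derivative computation the paper uses.

```python
from typing import Callable, List, Sequence, Tuple

Loss = Callable[[Sequence[float]], float]


def shift(x: Sequence[float], d: Sequence[float], t: float) -> List[float]:
    """Return the point x + t * d."""
    return [xi + t * di for xi, di in zip(x, d)]


def slope_and_curvature(
    loss: Loss, x: Sequence[float], d: Sequence[float], eps: float = 1e-3
) -> Tuple[float, float]:
    """Estimate the first and second directional derivatives of `loss`
    at x along d using central finite differences."""
    f_plus = loss(shift(x, d, eps))
    f_zero = loss(x)
    f_minus = loss(shift(x, d, -eps))
    slope = (f_plus - f_minus) / (2 * eps)
    curvature = (f_plus - 2 * f_zero + f_minus) / eps**2
    return slope, curvature


def quadratic_approx(
    loss: Loss, x: Sequence[float], d: Sequence[float], t: float, eps: float = 1e-3
) -> float:
    """Second-order Taylor estimate of loss(x + t * d)."""
    slope, curvature = slope_and_curvature(loss, x, d, eps)
    return loss(x) + t * slope + 0.5 * t * t * curvature


def toy_loss(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a policy loss (assumption, not from the paper)."""
    return x[0] ** 2 + 3 * x[1] ** 2


if __name__ == "__main__":
    x, d = [1.0, 2.0], [0.0, 1.0]
    print(slope_and_curvature(toy_loss, x, d))
    print(quadratic_approx(toy_loss, x, d, t=0.5), toy_loss(shift(x, d, 0.5)))
```

Because the toy loss is itself quadratic, the approximation matches the true value up to floating-point error. For a real policy loss the match holds only locally. How the slope and curvature would be turned into a cut-off between safe and adversarial observations is left open here, since the excerpt does not specify it.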
Related papers
- Robust Deep Reinforcement Learning Against Adversarial Perturbations On State Observations (2020)
- Deep Reinforcement Learning Policies Learn Shared Adversarial Features Across MDPs (2021)
- Adversarial Policies: Attacking Deep Reinforcement Learning (2019)
- Attacking And Defending Deep Reinforcement Learning Policies (2022)
- Regret-based Defense In Adversarial Reinforcement Learning (2023)
- Understanding Adversarial Attacks On Observations In Deep Reinforcement Learning (2021)
- Online Robust Policy Learning In The Presence Of Unknown Adversaries (2018)
- Rethinking Adversarial Attacks In Reinforcement Learning From Policy Distribution Perspective (2025)