Provably Invincible Adversarial Attacks On Reinforcement Learning Systems: A Rate-distortion Information-theoretic Approach
2025 · Ziqing Lu, Lifeng Lai, Weiyu Xu
Abstract
Reinforcement learning (RL) for Markov decision processes (MDPs) is used in many security-related applications, such as autonomous driving, financial decision-making, and drone and robot algorithms. Improving the robustness of RL systems against adversaries requires studying the adversarial attacks those systems may face. Most previous work considered deterministic adversarial attack strategies in MDPs, which the recipient (victim) agent can defeat by reversing the deterministic attack. In this paper, we propose a provably "invincible" or "uncounterable" type of adversarial attack on RL. The attacker applies a rate-distortion information-theoretic approach to randomly change the agent's observations of the transition kernel (or other properties), so that the agent gains zero or very limited information about the ground-truth kernel (or other properties) during training. We derive an information-theoretic lower bound on the recipient agent's reward regret and show th
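The contrast the abstract draws between deterministic and randomized attacks can be illustrated with a toy calculation. The sketch below is not the paper's construction. It assumes a hypothetical setting with two candidate transition kernels, labelled 0 and 1, and models the attack as a channel from the true label to the label the agent observes. All names (`mutual_information`, `deterministic_swap`, `uniform_random`) are illustrative. A deterministic swap is invertible, so the observation still carries one full bit about the true kernel, while a randomization that ignores the true label carries none.

```python
import math

def mutual_information(prior, channel):
    """Return I(X;Y) in bits for a prior p(x) and a channel p(y|x).

    `channel[x][y]` is the probability that the agent observes label y
    when the true label is x (hypothetical toy model, not the paper's).
    """
    n_y = len(channel[0])
    # Marginal distribution of the observed label.
    p_y = [sum(prior[x] * channel[x][y] for x in range(len(prior)))
           for y in range(n_y)]
    mi = 0.0
    for x, p_x in enumerate(prior):
        for y in range(n_y):
            p_xy = p_x * channel[x][y]
            if p_xy > 0:
                mi += p_xy * math.log2(channel[x][y] / p_y[y])
    return mi

# Assumed uniform prior over the two candidate kernels.
prior = [0.5, 0.5]

# Deterministic attack: always report the other kernel (invertible).
deterministic_swap = [[0.0, 1.0], [1.0, 0.0]]

# Randomized attack: report a label drawn independently of the truth.
uniform_random = [[0.5, 0.5], [0.5, 0.5]]

mi_deterministic = mutual_information(prior, deterministic_swap)
mi_random = mutual_information(prior, uniform_random)

print(f"deterministic swap: {mi_deterministic:.3f} bits")
print(f"uniform random:     {mi_random:.3f} bits")
```

In this toy model the agent can undo the swap exactly, which is the sense in which a deterministic attack is reversible. The independent randomization leaves nothing to invert, at the cost of the attacker distorting the observation more often.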
Related papers
- Optimal Attack And Defense For Reinforcement Learning (2023)
- Certifying Safety In Reinforcement Learning Under Adversarial Perturbation Attacks (2022)
- Robust Reinforcement Learning On State Observations With Learned Optimal Adversary (2021)
- Disturbing Reinforcement Learning Agents With Corrupted Rewards (2021)
- Reinforcement Learning Under Threats (2018)
- Real-time Adversarial Perturbations Against Deep Reinforcement Learning Policies: Attacks And Defenses (2021)
- Adversarial Policies: Attacking Deep Reinforcement Learning (2019)
- Improving Robustness Of Reinforcement Learning For Power System Control With Adversarial Training (2021)