Constrained Black-box Attacks Against Cooperative Multi-agent Reinforcement Learning
2025 · Amine Andam, Jamal Bentahar, Mustapha Hedabou
Abstract
Collaborative multi-agent reinforcement learning has rapidly evolved, offering state-of-the-art algorithms for real-world applications, including sensitive domains. However, a key challenge to its widespread adoption is the lack of a thorough investigation into its vulnerabilities to adversarial attacks. Existing work predominantly focuses on training-time attacks or unrealistic scenarios, such as access to policy weights or the ability to train surrogate policies. In this paper, we investigate new vulnerabilities under more challenging and constrained conditions, assuming an adversary can only collect and perturb the observations of deployed agents. We also consider scenarios where the adversary has no access at all (no observations, actions, or weights). Our main approach is to generate perturbations that intentionally misalign how victim agents see their environment. Our approach is empirically validated on three benchmarks and 22 environments, demonstrating its effectiveness across
Related papers
- SUB-PLAY: Adversarial Policies Against Partially Observed Multi-agent Reinforcement Learning Systems (2024)
- Attacking Cooperative Multi-agent Reinforcement Learning By Adversarial Minority Influence (2023)
- Toward Evaluating Robustness Of Reinforcement Learning With Adversarial Policy (2023)
- Backdoor Attacks On Multiagent Collaborative Systems (2022)
- Adversarial Attacks In Consensus-based Multi-agent Reinforcement Learning (2021)
- Adversarial Attacks In Cooperative AI (2021)
- Efficient Adversarial Attacks On Online Multi-agent Reinforcement Learning (2023)
- Adversarial Policies: Attacking Deep Reinforcement Learning (2019)