Learning to Defend: A Multi-Agent Reinforcement Learning Framework for Stackelberg Security Game in Mobile Edge Computing

Abstract

This paper addresses a general security problem in mobile edge computing. We model the problem as a two-stage attacker-defender Stackelberg security game and design a multi-agent reinforcement learning framework based on independent proximal policy optimization (IPPO) to solve it. The framework formulates a separate Markov decision process for the attacker and for the defender, each systematically representing network states, attack vectors, and resource allocation actions. The reward functions reflect each player's objective: network disruption for the attacker and network protection for the defender. Experimental results on randomly generated edge computing networks show that the proposed framework converges stably and significantly reduces attack success rates compared with baseline methods.
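For readers unfamiliar with independent proximal policy optimization, the following is a minimal sketch of the standard PPO clipped surrogate objective, evaluated separately for each agent on its own batch. It is illustrative only: the function name `ppo_clip_objective`, the clipping value `eps=0.2`, and the toy batches are assumptions, and the abstract does not state the exact loss (for example, value or entropy terms) used in the framework.

```python
import math

def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over one agent's batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    eps is a hypothetical clipping range, not a value from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)                # probability ratio
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)  # clip to [1-eps, 1+eps]
        total += min(ratio * adv, clipped * adv)         # pessimistic bound
    return total / len(advantages)

# "Independent" here means each agent optimizes its own objective from its
# own experience; the toy numbers below are made up for illustration.
batches = {
    "attacker": ([-1.0, -0.5], [-1.0, -0.9], [1.0, 2.0]),
    "defender": ([-0.7, -1.2], [-0.7, -0.8], [0.5, -1.0]),
}
objectives = {agent: ppo_clip_objective(*batch) for agent, batch in batches.items()}
```

In a full training loop, each agent would take gradient ascent steps on its own objective while treating the other agent as part of the environment.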