Strategically Robust Multi-agent Reinforcement Learning With Linear Function Approximation
2026 · Jake Gonzales, Max Horwitz, Eric Mazumdar, et al.
Abstract
Provably efficient and robust equilibrium computation in general-sum Markov games remains a core challenge in multi-agent reinforcement learning. Nash equilibrium is computationally intractable in general and brittle due to equilibrium multiplicity and sensitivity to approximation error. We study Risk-Sensitive Quantal Response Equilibrium (RQRE), which yields a unique, smooth solution under bounded rationality and risk sensitivity. We propose \texttt{RQRE-OVI}, an optimistic value iteration algorithm for computing RQRE with linear function approximation in large or continuous state spaces. Through finite-sample regret analysis, we establish convergence and explicitly characterize how sample complexity scales with rationality and risk-sensitivity parameters. The regret bounds reveal a quantitative tradeoff: increasing rationality tightens regret, while risk sensitivity induces regularization that enhances stability and robustness. This exposes a Pareto frontier between expected perfo
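To make the idea of a smooth, risk-sensitive quantal response concrete, the following is a minimal Python sketch for a single-stage two-player matrix game. It is not the paper's RQRE-OVI algorithm and uses no function approximation. The abstract does not specify the response model or the risk measure, so the logit (softmax) response, the entropic risk measure, the damped fixed-point iteration, and all names (`entropic_risk`, `quantal_response`, `solve_rqre_sketch`, `beta`, `tau`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def entropic_risk(payoffs, probs, tau):
    """Risk-averse certainty equivalent -(1/tau) * log E[exp(-tau * payoff)].

    ASSUMPTION: the entropic risk measure stands in for the paper's risk
    model. tau = 0 recovers the plain expectation.
    """
    if tau == 0:
        return float(probs @ payoffs)
    a = -tau * payoffs
    m = a.max()  # shift for a stable log-sum-exp
    return float(-(m + np.log(probs @ np.exp(a - m))) / tau)


def quantal_response(payoff, opponent, beta, tau):
    """Logit response: softmax of risk-adjusted action values.

    payoff[i, j] is this player's payoff for own action i against
    opponent action j. beta is the (hypothetical) rationality parameter.
    """
    values = np.array(
        [entropic_risk(payoff[i], opponent, tau) for i in range(payoff.shape[0])]
    )
    return softmax(beta * values)


def solve_rqre_sketch(A, B, beta, tau, iters=500, step=0.5):
    """Damped fixed-point iteration on the two response maps.

    A and B hold the row and column player's payoffs, both indexed
    [row action, column action]. Convergence is only expected when beta
    is small enough for the response map to be a contraction.
    """
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    y = np.full(A.shape[1], 1.0 / A.shape[1])
    for _ in range(iters):
        x_new = quantal_response(A, y, beta, tau)
        y_new = quantal_response(B.T, x, beta, tau)
        x = (1 - step) * x + step * x_new
        y = (1 - step) * y + step * y_new
    return x, y


if __name__ == "__main__":
    A = np.array([[2.0, 0.0], [0.0, 1.0]])
    B = np.array([[1.0, 0.0], [0.0, 2.0]])
    x, y = solve_rqre_sketch(A, B, beta=0.5, tau=0.5)
    print("row strategy:", x)
    print("column strategy:", y)
```

In this sketch a small `beta` keeps both strategies close to uniform and makes the iteration well behaved, while a larger `tau` penalizes actions whose payoff varies more with the opponent's choice. Whether these parameters match the paper's definitions is not something the abstract confirms.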
Related papers
- Taming Equilibrium Bias In Risk-sensitive Multi-agent Reinforcement Learning (2024)
- Distributionally Robust Online Markov Game With Linear Function Approximation (2025)
- Exploration-exploitation In Multi-agent Competition: Convergence With Bounded Rationality (2021)
- Provably Efficient Reinforcement Learning In Decentralized General-sum Markov Games (2021)
- Provably Efficient Cooperative Multi-agent Reinforcement Learning With Function Approximation (2021)
- Minimax-optimal Multi-agent Robust Reinforcement Learning (2024)
- Minimax-optimal Multi-agent RL In Markov Games With A Generative Model (2022)
- Robust Cooperative Multi-agent Reinforcement Learning: A Mean-field Type Game Perspective (2024)