Abstract
arXiv:2604.05129v2

We investigate the strategic surplus that a clairvoyant optimizer can extract from a Follow-the-Regularized-Leader (FTRL) learner with constant step size $\eta$ in $n\times m$ two-player zero-sum games played over $T$ rounds. In contrast with prior analyses, we show that the extraction of regret-scale surplus is an inherent feature of the FTRL family rather than an artifact of specific instantiations. First, for a fixed max-min optimizer, we establish a sweeping law of order $\Omega(N_{\mathrm{sub}}/\eta)$, proving that the utility surplus scales with the number $N_{\mathrm{sub}}$ of the learner's suboptimal actions and vanishes in their absence. Second, for an alternating optimizer, a surplus of $\Omega(\eta T/\mathrm{poly}(n,m))$ can be guaranteed with high probability in random games, regardless of the equilibrium structure. Our analysis uncovers a sharp geometric dichotomy: non-steep regularizers allow the optimizer to realize the maximal transient surplus via finite-time elimination of suboptimal actions, whereas steep regularizers introduce a vanishing tail correction that can delay surplus saturation. Finally, we discuss whether this leverage persists under bilateral payoff uncertainty and propose a susceptibility measure quantifying which regularizers are most vulnerable to learner-aware strategic steering.
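The steep versus non-steep dichotomy can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a learner that maximizes payoff, a fixed opponent strategy that induces a constant per-round payoff vector, the negative-entropy regularizer as the steep example, and the squared Euclidean norm as the non-steep example. The function names, the payoff vector `[1.0, 0.5, 0.0]`, and the values of `eta` and `rounds` are all hypothetical choices made for illustration.

```python
import math


def ftrl_entropic(cum_payoff, eta):
    """FTRL with the negative-entropy (steep) regularizer.

    The maximizer of eta * <x, U> - sum_i x_i log x_i over the simplex
    is the softmax of eta * U.
    """
    scores = [eta * u for u in cum_payoff]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def ftrl_euclidean(cum_payoff, eta):
    """FTRL with R(x) = ||x||^2 / 2 (non-steep regularizer).

    The maximizer of eta * <x, U> - R(x) over the simplex is the
    Euclidean projection of eta * U onto the simplex (sort-based method).
    """
    v = [eta * u for u in cum_payoff]
    running, theta = 0.0, 0.0
    for k, vk in enumerate(sorted(v, reverse=True), start=1):
        running += vk
        candidate = (running - 1.0) / k
        if vk - candidate > 0:  # coordinate k stays in the support
            theta = candidate
    return [max(vi - theta, 0.0) for vi in v]


def play_fixed_opponent(payoff, eta, rounds):
    """Learner strategies after `rounds` rounds of a constant payoff vector."""
    cum = [p * rounds for p in payoff]  # cumulative payoff U_t = t * u
    return ftrl_entropic(cum, eta), ftrl_euclidean(cum, eta)


if __name__ == "__main__":
    entropic, euclid = play_fixed_opponent([1.0, 0.5, 0.0], eta=0.1, rounds=30)
    print("entropic :", entropic)
    print("euclidean:", euclid)
```

In this toy setting the Euclidean learner assigns exactly zero mass to both suboptimal actions after finitely many rounds, while the entropic learner keeps a strictly positive, exponentially decaying mass on them. This is consistent with the finite-time elimination and vanishing-tail behaviour the abstract describes, although it does not reproduce the paper's surplus bounds.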