A Unified Variational Design of Predictive Mirror Descent in Convex Games under Stochastic Feedback
2026-06-01 • Computer Science and Game Theory
AI summary
The authors study mirror descent, a method for learning strategies in games whose last iterate can fail to converge in weakly stable regimes, where the dynamics may rotate or recur. They derive predictive feedback variationally by constructing a stochastic differential game with an auxiliary memory state. The resulting dynamics come with finite-horizon guarantees near stable equilibria, both in expectation and with high probability, along with an estimate of the probability of leaving the local neighborhood. In short, the authors offer a unified way to construct and analyze predictive mirror descent in convex games.
Mirror descent, Predictive feedback, Stochastic differential game, Fenchel duality, Bregman divergence, Last-iterate convergence, Brownian diffusion, Local stability, Variational methods, Equilibrium feedback
Authors
Yunian Pan, Tao Li, Quanyan Zhu
Abstract
Mirror descent provides a geometric framework for learning in games, but its last-iterate behavior can fail in weakly stable regimes, where the dynamics may exhibit rotational or recurrent transients. Predictive mirror methods mitigate this issue by modifying the feedback entering the mirror update, yet standard predictive variants are typically introduced algorithmically and analyzed one at a time. This letter gives a variational route to predictive feedback by constructing a stochastic mirror differential game with an auxiliary memory state. Its stage cost couples two Fenchel terms: a strategic term evaluated at a predicted profile and a corrective term driven by realized feedback. The resulting equilibrium feedback induces two-channel predictive mirror dynamics in general mirror geometry. Under local mirror regularity, a quantitative local Bregman growth condition, and bounded Brownian diffusion, we establish finite-horizon local terminal-time bounds in expectation and with high probability, together with an exit-probability estimate for the localization neighborhood. The result provides a unified variational construction of the induced predictive-memory mirror flow together with a local stochastic certificate for last-iterate performance near stable equilibria.
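To make the rotational failure mode concrete, here is a minimal sketch that is not the construction proposed in this letter. It assumes the simplest setting: the bilinear game min_x max_y x*y with the Euclidean mirror map (so mirror descent reduces to gradient descent-ascent), noise-free feedback, and the standard optimistic extrapolation 2*g_t - g_{t-1} as the predictive feedback. The function name `run`, the step size, and the starting point are illustrative choices.

```python
import math


def run(eta=0.1, steps=2000, predictive=False):
    """Return the distance to the equilibrium (0, 0) of min_x max_y x*y
    after `steps` updates started from (1, 1)."""
    x, y = 1.0, 1.0
    # Game gradient field F(x, y) = (y, -x): the x-player descends on x*y,
    # the y-player ascends on x*y.
    gx_prev, gy_prev = y, -x
    for _ in range(steps):
        gx, gy = y, -x
        if predictive:
            # Predictive feedback: extrapolate using the previous gradient.
            fx, fy = 2.0 * gx - gx_prev, 2.0 * gy - gy_prev
        else:
            # Plain feedback: use the realized gradient as is.
            fx, fy = gx, gy
        # Euclidean mirror step (identity mirror map, unconstrained domain).
        x, y = x - eta * fx, y - eta * fy
        gx_prev, gy_prev = gx, gy
    return math.hypot(x, y)


plain = run(predictive=False)
optimistic = run(predictive=True)
print(f"plain distance to equilibrium:      {plain:.3e}")
print(f"predictive distance to equilibrium: {optimistic:.3e}")
```

Under these assumptions the plain iterates spiral away from the equilibrium, while the predictive iterates contract toward it. The letter's setting is more general (arbitrary mirror geometry, an auxiliary memory state, and Brownian noise), none of which this sketch models.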