Paper Title

No-regret learning and mixed Nash equilibria: They do not mix

Paper Authors

Lampros Flokas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Thanasis Lianeas, Panayotis Mertikopoulos, Georgios Piliouras

Paper Abstract

Understanding the behavior of no-regret dynamics in general $N$-player games is a fundamental question in online learning and game theory. A folk result in the field states that, in finite games, the empirical frequency of play under no-regret learning converges to the game's set of coarse correlated equilibria. By contrast, our understanding of how the day-to-day behavior of the dynamics correlates to the game's Nash equilibria is much more limited, and only partial results are known for certain classes of games (such as zero-sum or congestion games). In this paper, we study the dynamics of "follow-the-regularized-leader" (FTRL), arguably the most well-studied class of no-regret dynamics, and we establish a sweeping negative result showing that the notion of mixed Nash equilibrium is antithetical to no-regret learning. Specifically, we show that any Nash equilibrium which is not strict (in that every player has a unique best response) cannot be stable and attracting under the dynamics of FTRL. This result has significant implications for predicting the outcome of a learning process as it shows unequivocally that only strict (and hence, pure) Nash equilibria can emerge as stable limit points thereof.
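The FTRL dynamics discussed in the abstract admit a compact description: each player accumulates the payoff scores of their pure strategies and plays the regularized best response to that score vector. Below is a minimal sketch, assuming the entropic regularizer (under which FTRL reduces to exponential weights) and Matching Pennies as an illustrative game; the payoff matrix, step size `eta`, and horizon `T` are our own choices, not taken from the paper.

```python
import numpy as np

# A minimal sketch of FTRL with the entropic regularizer (i.e., exponential
# weights) in Matching Pennies, a 2x2 zero-sum game whose unique Nash
# equilibrium is the fully mixed profile (1/2, 1/2). The game matrix, step
# size, and horizon are illustrative choices, not taken from the paper.

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # payoffs of player 1; player 2 receives -A

def ftrl_strategy(y, eta=0.1):
    """Entropic FTRL: argmax_x <y, x> - (1/eta) * sum_a x_a log x_a over the
    simplex, which has the closed-form softmax solution."""
    z = eta * (y - y.max())   # shift scores for numerical stability
    w = np.exp(z)
    return w / w.sum()

T = 2000
y1, y2 = np.zeros(2), np.zeros(2)   # cumulative payoff scores
avg1 = np.zeros(2)                  # empirical frequency of player 1's play

for t in range(T):
    x1, x2 = ftrl_strategy(y1), ftrl_strategy(y2)
    avg1 += x1 / T
    # full-information feedback: expected payoff of each pure strategy
    y1 += A @ x2
    y2 += -A.T @ x1

print("empirical frequency (player 1):", avg1)  # approaches (1/2, 1/2)
print("final day-to-day strategy:     ", x1)    # orbits the equilibrium
```

In such a run, the time-averaged play approaches the fully mixed equilibrium (the folk result on coarse correlated equilibria), while the day-to-day iterates orbit around it rather than settling on it, consistent with the abstract's claim that mixed Nash equilibria cannot be stable and attracting under FTRL.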
