论文标题
零和通用双线游戏中乐观的梯度下降上升
Optimistic Gradient Descent Ascent in Zero-Sum and General-Sum Bilinear Games
论文作者
论文摘要
我们研究了不受约束的双线性游戏中乐观梯度上升的融合。在第一部分中,我们考虑了零和情况,并扩展了Daskalakis等人的先前结果。在2018年,Liang和Stokes在2019年等等:我们证明,对于任何回报矩阵,OGDA与鞍点的指数融合,还为收敛提供了新的,最佳的几何比率。我们还表征了诱导收敛的步骤大小,并能够推断出收敛速度的最佳步长。 In a second part, we introduce OGDA for general-sum bilinear games: we show that in an interesting class of games, either OGDA converges exponentially fast to a Nash equilibrium, or the payoffs for both players converge exponentially fast to $+\infty$ (which might be interpreted as endogenous emergence of coordination, or cooperation, among players).我们还提供了足够的条件,可以使OGDA融合到NASH平衡。这些条件用于通过使用Moore-Penrose倒数矩阵$ a $ a $引入一般 - $ A $,以提高涉及矩阵$ a $的Min-Max问题的收敛速度。据我们所知,这首先表明通用游戏可用于最佳改进针对最小问题问题的算法。我们最终在生成对抗网络的简单示例上说明了我们的结果。
We study the convergence of Optimistic Gradient Descent Ascent in unconstrained bilinear games. In a first part, we consider the zero-sum case and extend previous results by Daskalakis et al. in 2018, Liang and Stokes in 2019, and others: we prove, for any payoff matrix, the exponential convergence of OGDA to a saddle point and also provide a new, optimal, geometric ratio for the convergence. We also characterize the step sizes inducing convergence, and are able to deduce the optimal step size for the speed of convergence. In a second part, we introduce OGDA for general-sum bilinear games: we show that in an interesting class of games, either OGDA converges exponentially fast to a Nash equilibrium, or the payoffs for both players converge exponentially fast to $+\infty$ (which might be interpreted as endogenous emergence of coordination, or cooperation, among players). We also give sufficient conditions for convergence of OGDA to a Nash equilibrium. These conditions are used to increase the speed of convergence of a min-max problem involving a matrix $A$, by introducing a general-sum game using the Moore-Penrose inverse matrix of $A$. This shows for the first time, at our knowledge, that general-sum games can be used to optimally improve algorithms designed for min-max problems. We finally illustrate our results on simple examples of Generative Adversarial Networks.