零和通用双线游戏中乐观的梯度下降上升

论文标题

零和通用双线游戏中乐观的梯度下降上升

Optimistic Gradient Descent Ascent in Zero-Sum and General-Sum Bilinear Games

论文作者

de Montbrun, Étienne, Renault, Jérôme

论文摘要

我们研究了不受约束的双线性游戏中乐观梯度上升的融合。在第一部分中，我们考虑了零和情况，并扩展了Daskalakis等人的先前结果。在2018年，Liang和Stokes在2019年等等：我们证明，对于任何回报矩阵，OGDA与鞍点的指数融合，还为收敛提供了新的，最佳的几何比率。我们还表征了诱导收敛的步骤大小，并能够推断出收敛速度的最佳步长。 In a second part, we introduce OGDA for general-sum bilinear games: we show that in an interesting class of games, either OGDA converges exponentially fast to a Nash equilibrium, or the payoffs for both players converge exponentially fast to $+\infty$ (which might be interpreted as endogenous emergence of coordination, or cooperation, among players).我们还提供了足够的条件，可以使OGDA融合到NASH平衡。这些条件用于通过使用Moore-Penrose倒数矩阵$ a $ a $引入一般 - $ A $，以提高涉及矩阵$ a $的Min-Max问题的收敛速度。据我们所知，这首先表明通用游戏可用于最佳改进针对最小问题问题的算法。我们最终在生成对抗网络的简单示例上说明了我们的结果。

We study the convergence of Optimistic Gradient Descent Ascent in unconstrained bilinear games. In a first part, we consider the zero-sum case and extend previous results by Daskalakis et al. in 2018, Liang and Stokes in 2019, and others: we prove, for any payoff matrix, the exponential convergence of OGDA to a saddle point and also provide a new, optimal, geometric ratio for the convergence. We also characterize the step sizes inducing convergence, and are able to deduce the optimal step size for the speed of convergence. In a second part, we introduce OGDA for general-sum bilinear games: we show that in an interesting class of games, either OGDA converges exponentially fast to a Nash equilibrium, or the payoffs for both players converge exponentially fast to $+\infty$ (which might be interpreted as endogenous emergence of coordination, or cooperation, among players). We also give sufficient conditions for convergence of OGDA to a Nash equilibrium. These conditions are used to increase the speed of convergence of a min-max problem involving a matrix $A$, by introducing a general-sum game using the Moore-Penrose inverse matrix of $A$. This shows for the first time, at our knowledge, that general-sum games can be used to optimally improve algorithms designed for min-max problems. We finally illustrate our results on simple examples of Generative Adversarial Networks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题