Paper Title

Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax Optimization

Paper Authors

Kiran Koshy Thekumparampil, Niao He, Sewoong Oh

Paper Abstract

We study the bilinearly coupled minimax problem: $\min_{x} \max_{y} f(x) + y^\top A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions and admit first-order gradient oracles. Surprisingly, no known first-order algorithms have hitherto achieved the lower complexity bound of $\Omega((\sqrt{\frac{L_x}{\mu_x}} + \frac{\|A\|}{\sqrt{\mu_x \mu_y}} + \sqrt{\frac{L_y}{\mu_y}}) \log(\frac{1}{\varepsilon}))$ for solving this problem up to an $\varepsilon$ primal-dual gap in the general parameter regime, where $L_x, L_y, \mu_x, \mu_y$ are the corresponding smoothness and strong convexity constants. We close this gap by devising the first optimal algorithm, the Lifted Primal-Dual (LPD) method. Our method lifts the objective into an extended form that allows both the smooth terms and the bilinear term to be handled optimally and seamlessly with the same primal-dual framework. Besides optimality, our method yields a desirably simple single-loop algorithm that uses only one gradient oracle call per iteration. Moreover, when $f$ is just convex, the same algorithm applied to a smoothed objective achieves a nearly optimal iteration complexity. We also provide a direct single-loop algorithm, using the LPD method, that achieves the iteration complexity of $O(\sqrt{\frac{L_x}{\varepsilon}} + \frac{\|A\|}{\sqrt{\mu_y \varepsilon}} + \sqrt{\frac{L_y}{\varepsilon}})$. Numerical experiments on quadratic minimax problems and policy evaluation problems further demonstrate the fast convergence of our algorithm in practice.
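To make the problem class concrete, below is a minimal numerical sketch on a small quadratic instance of $\min_{x} \max_{y} f(x) + y^\top A x - h(y)$ with strongly convex quadratic $f$ and $h$. The instance (matrices `P`, `Q`, `A`, vector `b`) and the step size are illustrative assumptions, and the update shown is plain simultaneous gradient descent-ascent, used only to exhibit the saddle-point structure and the first-order oracles; it is not the paper's Lifted Primal-Dual (LPD) method, whose updates are not specified in the abstract.

```python
# Minimal sketch (assumed toy instance): bilinearly coupled minimax problem
#   min_x max_y  f(x) + y^T A x - h(y)
# with f(x) = 0.5 x^T P x - b^T x and h(y) = 0.5 y^T Q y, both strongly convex.
# The solver below is plain simultaneous gradient descent-ascent, NOT the LPD method.
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 4


def random_spd(d):
    # Identity plus a scaled random PSD part: smallest eigenvalue >= 1,
    # so the resulting quadratic is 1-strongly convex and smooth.
    M = rng.standard_normal((d, d))
    return np.eye(d) + 0.2 * M @ M.T


P, Q = random_spd(n), random_spd(m)
A = rng.standard_normal((m, n))
b = rng.standard_normal(n)

# Closed-form saddle point from the optimality conditions
#   P x* - b + A^T y* = 0   and   A x* - Q y* = 0.
x_star = np.linalg.solve(P + A.T @ np.linalg.solve(Q, A), b)
y_star = np.linalg.solve(Q, A @ x_star)

grad_f = lambda x: P @ x - b   # first-order oracle for f
grad_h = lambda y: Q @ y       # first-order oracle for h

# Conservative Lipschitz bound for the saddle gradient field, and a step size
# small enough for plain gradient descent-ascent to contract linearly.
L = np.linalg.norm(P, 2) + np.linalg.norm(Q, 2) + np.linalg.norm(A, 2)
mu = 1.0                       # strong convexity of both f and h by construction
eta = mu / L**2

x, y = np.zeros(n), np.zeros(m)
for _ in range(20000):
    gx = grad_f(x) + A.T @ y   # descend in the primal variable x
    gy = A @ x - grad_h(y)     # ascend in the dual variable y
    x, y = x - eta * gx, y + eta * gy

print("distance to saddle point:",
      np.linalg.norm(x - x_star) + np.linalg.norm(y - y_star))
```

Run as a script, this prints a distance near zero, but only after many iterations with a tiny step size; the point of the paper is that an optimal method such as LPD attains the $\Omega((\sqrt{\frac{L_x}{\mu_x}} + \frac{\|A\|}{\sqrt{\mu_x \mu_y}} + \sqrt{\frac{L_y}{\mu_y}}) \log(\frac{1}{\varepsilon}))$ rate on this problem class, which this naive baseline does not.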
