Paper Title
Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem
Paper Authors
Paper Abstract
Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. Relying on no problem-dependent parameters in the step sizes, the convergence rate of ExtraFW for general convex problems is shown to be ${\cal O}(\frac{1}{k})$, which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. However, the merit of ExtraFW is its faster rate of ${\cal O}\big(\frac{1}{k^2}\big)$ on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and a fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than FW, and even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW enjoys a smaller optimality gap and a lower rank than FW.
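To make the setting concrete, below is a minimal sketch of a plain Frank-Wolfe loop with the parameter-free step size $\gamma_k = 2/(k+2)$, applied to a toy sparsity-constrained least-squares problem over an $\ell_1$ ball. This is the baseline FW method the abstract compares against, not the paper's ExtraFW update: the abstract only states that ExtraFW additionally evaluates a second gradient per iteration in a prediction-correction format, and that extra step is omitted here. The problem instance (`c`, the ball radius, the iteration count) is an illustrative assumption.

```python
import numpy as np

def lmo_l1(grad, radius=1.0):
    """Linear minimization oracle over the l1 ball of the given radius:
    argmin_{||s||_1 <= radius} <grad, s> is a signed vertex on the
    coordinate where |grad| is largest."""
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_f, x0, radius=1.0, iters=200):
    """Plain FW: one subproblem (the LMO call) per iteration; the step
    size 2/(k+2) contains no problem-dependent constants."""
    x = x0.copy()
    for k in range(iters):
        s = lmo_l1(grad_f(x), radius)
        gamma = 2.0 / (k + 2)
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Toy sparsity-constrained least squares: f(x) = 0.5 * ||x - c||^2,
# with the minimizer c placed inside the l1 ball so that f* = 0.
c = np.array([0.3, -0.2, 0.1, 0.0, 0.0])
f = lambda x: 0.5 * np.sum((x - c) ** 2)
grad_f = lambda x: x - c

x0 = np.zeros(5)
x = frank_wolfe(grad_f, x0, radius=1.0, iters=200)
```

Because every iterate is a convex combination of points in the $\ell_1$ ball, feasibility is maintained for free, which is the practical appeal of FW-type methods under structural constraints; the standard ${\cal O}(\frac{1}{k})$ bound guarantees the objective is small after the 200 iterations above.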