A Zeroth-order Proximal Stochastic Gradient Method for Weakly Convex Stochastic Optimization

Paper title

A Zeroth-order Proximal Stochastic Gradient Method for Weakly Convex Stochastic Optimization

Authors

Spyridon Pougkakiotis, Dionysios S. Kalogerias

Abstract

In this paper, we analyze a zeroth-order proximal stochastic gradient method suitable for the minimization of weakly convex stochastic optimization problems. We consider nonsmooth and nonlinear stochastic composite problems, for which (sub-)gradient information might be unavailable. The proposed algorithm utilizes the well-known Gaussian smoothing technique, which yields unbiased zeroth-order gradient estimators of a related partially smooth surrogate problem (in which one of the two nonsmooth terms in the original problem's objective is replaced by a smooth approximation). This allows us to employ a standard proximal stochastic gradient scheme for the approximate solution of the surrogate problem, which is determined by a single smoothing parameter, without using first-order information. We provide state-of-the-art convergence rates for the proposed zeroth-order method using minimal assumptions. The proposed scheme is numerically compared against alternative zeroth-order methods as well as a stochastic sub-gradient scheme on a standard phase retrieval problem. Further, we showcase the usefulness and effectiveness of our method for the unique setting of automated hyper-parameter tuning. In particular, we focus on automatically tuning the parameters of optimization algorithms by minimizing a novel heuristic model. The proposed approach is tested on a proximal alternating direction method of multipliers for the solution of $\mathcal{L}_1/\mathcal{L}_2$-regularized PDE-constrained optimal control problems, with evident empirical success.
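The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`soft_threshold`, `zo_gradient`, `zo_prox_sgd`), the choice of an $\ell_1$ regularizer as the proximable term, the constant step size, and the two-point form of the Gaussian-smoothing estimator are all assumptions made for illustration. The estimator only evaluates the stochastic function `F`, never its (sub-)gradient.

```python
import random


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (an assumed example regularizer)."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in x]


def zo_gradient(F, x, xi, mu, rng):
    """Two-point Gaussian-smoothing estimate of the gradient of the smoothed
    surrogate: ((F(x + mu*u, xi) - F(x, xi)) / mu) * u, with u ~ N(0, I)."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    shifted = [xj + mu * uj for xj, uj in zip(x, u)]
    scale = (F(shifted, xi) - F(x, xi)) / mu
    return [scale * uj for uj in u]


def zo_prox_sgd(F, sample, x0, step, mu, lam, iters, seed=0):
    """Zeroth-order proximal stochastic gradient loop (illustrative only).

    F(x, xi): stochastic objective value for sample xi (function values only).
    sample(rng): draws one random sample xi.
    mu: smoothing parameter; lam: weight of the assumed l1 term.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        xi = sample(rng)
        g = zo_gradient(F, x, xi, mu, rng)
        forward = [xj - step * gj for xj, gj in zip(x, g)]
        x = soft_threshold(forward, step * lam)  # proximal step
    return x


if __name__ == "__main__":
    # Hypothetical nonsmooth test objective with minimizer (1, -2); xi is unused.
    F = lambda x, xi: abs(x[0] - 1.0) + abs(x[1] + 2.0)
    print(zo_prox_sgd(F, lambda rng: None, [0.0, 0.0],
                      step=0.01, mu=0.01, lam=0.0, iters=5000))
```

The single smoothing parameter `mu` controls how closely the smoothed surrogate tracks the original objective; a phase retrieval loss such as `abs(dot(a, x)**2 - b)` could be passed as `F` in the same way.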
