Langevin dynamics based algorithm e-TH$\varepsilon$O POULA for stochastic optimization problems with discontinuous stochastic gradient

Paper title

Langevin dynamics based algorithm e-TH$\varepsilon$O POULA for stochastic optimization problems with discontinuous stochastic gradient

Authors

Dong-Young Lim, Ariel Neufeld, Sotirios Sabanis, Ying Zhang

Abstract

We introduce a new Langevin dynamics based algorithm, called e-TH$\varepsilon$O POULA, to solve optimization problems with discontinuous stochastic gradients which naturally appear in real-world applications such as quantile estimation, vector quantization, CVaR minimization, and regularized optimization problems involving ReLU neural networks. We demonstrate both theoretically and numerically the applicability of the e-TH$\varepsilon$O POULA algorithm. More precisely, under the conditions that the stochastic gradient is locally Lipschitz in average and satisfies a certain convexity at infinity condition, we establish non-asymptotic error bounds for e-TH$\varepsilon$O POULA in Wasserstein distances and provide a non-asymptotic estimate for the expected excess risk, which can be controlled to be arbitrarily small. Three key applications in finance and insurance are provided, namely, multi-period portfolio optimization, transfer learning in multi-period portfolio optimization, and insurance claim prediction, which involve neural networks with (Leaky)-ReLU activation functions. Numerical experiments conducted using real-world datasets illustrate the superior empirical performance of e-TH$\varepsilon$O POULA compared to SGLD, TUSLA, ADAM, and AMSGrad in terms of model accuracy.
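
To make the notion of a discontinuous stochastic gradient concrete, the following is a minimal sketch of quantile estimation, one of the applications named in the abstract, solved with plain SGLD, one of the baselines. It is not e-TH$\varepsilon$O POULA itself, whose update rule is not given on this page. The function name `sgld_quantile`, the step size, the inverse temperature `beta`, and the standard normal data are all illustrative assumptions.

```python
import math
import random


def sgld_quantile(q=0.95, step=0.01, beta=1e4, n_steps=20000, seed=0):
    """Estimate the q-quantile of N(0, 1) with plain SGLD (illustrative only).

    The objective is the expected pinball loss E[rho_q(X - theta)]. Its
    stochastic gradient in theta is 1{X < theta} - q, which jumps at
    theta = X and is therefore discontinuous.
    """
    rng = random.Random(seed)
    theta = 0.0
    # Standard deviation of the Gaussian noise injected at each SGLD step.
    noise_scale = math.sqrt(2.0 * step / beta)
    tail_sum, tail_count = 0.0, 0
    for n in range(n_steps):
        x = rng.gauss(0.0, 1.0)                  # one data sample
        grad = (1.0 if x < theta else 0.0) - q   # discontinuous stochastic gradient
        theta = theta - step * grad + noise_scale * rng.gauss(0.0, 1.0)
        if n >= n_steps // 2:                    # average the second half of the run
            tail_sum += theta
            tail_count += 1
    return tail_sum / tail_count


estimate = sgld_quantile()
print(f"SGLD estimate of the 0.95-quantile: {estimate:.3f} (true value is about 1.645)")
```

Averaging the second half of the iterates is a design choice of this sketch to reduce the fluctuation caused by the constant step size; it is not taken from the paper.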
