Paper Title


A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Paper Authors

Wei Deng, Guang Lin, Faming Liang

Abstract


We propose an adaptively weighted stochastic gradient Langevin dynamics (SGLD) algorithm, the so-called contour stochastic gradient Langevin dynamics (CSGLD), for Bayesian learning in big data statistics. The proposed algorithm is essentially a scalable dynamic importance sampler, which automatically flattens the target distribution such that the simulation of a multi-modal distribution can be greatly facilitated. Theoretically, we prove a stability condition and establish the asymptotic convergence of the self-adapting parameter to a unique fixed point, regardless of the non-convexity of the original energy function; we also present an error analysis for the weighted averaging estimators. Empirically, the CSGLD algorithm is tested on multiple benchmark datasets, including CIFAR10 and CIFAR100. The numerical results indicate its superiority in avoiding the local-trap problem in training deep neural networks.
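To illustrate the mechanism the abstract describes, the sketch below runs an adaptively weighted Langevin sampler on a 1D double-well energy: a self-adapting parameter `theta` estimates the mass of each energy subregion via stochastic approximation, and log-differences of `theta` rescale the gradient so the energy landscape is effectively flattened. This is a hedged toy sketch, not the paper's implementation: all names, grid settings, and step sizes (`m`, `du`, `zeta`, `gamma`) are hypothetical choices, and a full-batch gradient stands in for the mini-batch stochastic gradient used in the actual CSGLD algorithm.

```python
import numpy as np

def energy(x):
    """Double-well energy with modes near x = -2 and x = +2."""
    return 0.1 * (x**2 - 4.0)**2

def grad_energy(x):
    return 0.4 * x * (x**2 - 4.0)

def csgld(n_steps=100_000, lr=1e-3, zeta=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # Partition the energy range into m subregions of width du (toy grid).
    m, u_min, du = 50, 0.0, 0.2
    theta = np.full(m, 1.0 / m)  # self-adapting estimate of each subregion's mass
    x, samples = -2.0, []
    for k in range(n_steps):
        j = min(m - 1, max(0, int((energy(x) - u_min) / du)))  # current subregion
        # Gradient multiplier: log-theta differences flatten the landscape;
        # it can turn negative, which pushes the sampler uphill out of a mode.
        mult = 1.0 if j == 0 else \
            1.0 + zeta * (np.log(theta[j]) - np.log(theta[j - 1])) / du
        # Langevin step on the adaptively weighted (flattened) energy.
        x += -lr * mult * grad_energy(x) + np.sqrt(2.0 * lr) * rng.normal()
        x = float(np.clip(x, -6.0, 6.0))  # numerical guard for this toy example
        # Stochastic-approximation update of theta with a decaying step size.
        gamma = 10.0 / (k + 100.0)
        onehot = np.zeros(m)
        onehot[j] = 1.0
        theta = theta + gamma * theta[j] ** zeta * (onehot - theta)
        theta = np.clip(theta, 1e-12, None)
        theta /= theta.sum()
        samples.append(x)
    return np.array(samples), theta
```

In a run of this sketch, the samples should visit both wells rather than staying trapped in the one they start in, and `theta` should accumulate most of its mass on the low-energy subregions, which is the flattening effect the abstract refers to.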
