Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

Paper Title

Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

Authors

Kyurae Kim, Jisu Oh, Jacob R. Gardner, Adji Bousso Dieng, Hongseok Kim

Abstract

Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their mixing rate and gradient variance. To do this, we demonstrate that these methods, which we collectively refer to as Markov chain score ascent (MCSA) methods, can be cast as special cases of the Markov chain gradient descent framework. Furthermore, by leveraging this new understanding, we develop a novel MCSA scheme, parallel MCSA (pMCSA), that achieves a tighter bound on the gradient variance. We demonstrate that this improved theoretical result translates to superior empirical performance.
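
To make the abstract's first sentence concrete, the block below sketches why the inclusive KL gradient is an integral over the posterior and what a generic score-ascent update driven by a Markov chain looks like. The notation is an assumption made for illustration, not taken from the abstract: \pi is the posterior, q_\lambda the variational family, s the score function, K_\lambda a Markov kernel that leaves \pi invariant, and \gamma_t a step size. The first identity also assumes that differentiation and integration can be interchanged.

```latex
% Inclusive KL divergence between the posterior \pi and the variational family q_\lambda
\mathrm{KL}(\pi \,\|\, q_\lambda)
  = \int \pi(z) \,\log \frac{\pi(z)}{q_\lambda(z)} \,\mathrm{d}z

% Only the -\log q_\lambda(z) term depends on \lambda, so the gradient is a
% posterior expectation of the score s(\lambda; z) = \nabla_\lambda \log q_\lambda(z)
\nabla_\lambda \mathrm{KL}(\pi \,\|\, q_\lambda)
  = -\int \pi(z) \,\nabla_\lambda \log q_\lambda(z) \,\mathrm{d}z
  = -\mathbb{E}_{z \sim \pi}\!\left[ s(\lambda; z) \right]

% Generic score-ascent step: the intractable expectation is replaced by the score
% at the current state of a Markov chain (assumed kernel K_\lambda, step size \gamma_t)
z_t \sim K_{\lambda_{t-1}}(z_{t-1}, \cdot), \qquad
\lambda_t = \lambda_{t-1} + \gamma_t \, s(\lambda_{t-1}; z_t)
```

Because z_t comes from a chain that has generally not reached stationarity, s(\lambda_{t-1}; z_t) is a biased estimate of the negative KL gradient. This is why the abstract frames the analysis in terms of the chain's mixing rate and the gradient variance.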
