Paper Title

On the optimization and pruning for Bayesian deep learning

Paper Authors

Xiongwen Ke, Yanan Fan

Paper Abstract

The goal of Bayesian deep learning is to provide uncertainty quantification via the posterior distribution. However, exact inference over the weight space is computationally intractable due to the ultra-high dimensionality of neural networks. Variational inference (VI) is a promising approach, but its naive application to the weight space does not scale well and often underperforms in predictive accuracy. In this paper, we propose a new adaptive variational Bayesian algorithm to train neural networks on the weight space that achieves high predictive accuracy. By showing an equivalence to Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) with a preconditioning matrix, we then propose an MCMC-within-EM algorithm, which incorporates a spike-and-slab prior to capture the sparsity of the neural network. The EM-MCMC algorithm allows us to perform optimization and model pruning in one shot. We evaluate our methods on the CIFAR-10, CIFAR-100 and ImageNet datasets, and demonstrate that our dense model can reach state-of-the-art performance and our sparse model performs very well compared to previously proposed pruning schemes.
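
To make the two ingredients named in the abstract concrete, below is a minimal numpy sketch of (i) an SGHMC update with a diagonal preconditioning matrix and (ii) a spike-and-slab E-step that turns posterior inclusion probabilities into a pruning mask. This is an illustration under simplifying assumptions (diagonal mass matrix, Gaussian spike rather than a point mass, omitted minibatch-noise correction), not the authors' implementation; all function names and hyperparameters (eps, C, v0, v1, pi) are hypothetical.

```python
# Illustrative sketch only, not the paper's code.
import numpy as np

def sghmc_step(theta, r, grad_U, eps=1e-3, C=1.0, Minv=None, rng=None):
    """One SGHMC update (Chen et al., 2014) with a diagonal inverse-mass
    preconditioner Minv. grad_U is a stochastic gradient of the negative
    log posterior; the minibatch-noise correction term is omitted."""
    rng = np.random.default_rng() if rng is None else rng
    Minv = np.ones_like(theta) if Minv is None else Minv
    theta = theta + eps * Minv * r                              # position update
    noise = rng.normal(0.0, np.sqrt(2.0 * eps * C), size=theta.shape)
    r = r - eps * grad_U(theta) - eps * C * Minv * r + noise    # momentum update
    return theta, r

def e_step_inclusion(theta, v0=1e-4, v1=1.0, pi=0.5):
    """E-step: posterior probability that each weight was drawn from the
    slab N(0, v1) rather than the spike N(0, v0), with prior weight pi."""
    logpdf = lambda x, v: -0.5 * (np.log(2.0 * np.pi * v) + x**2 / v)
    log_slab = np.log(pi) + logpdf(theta, v1)
    log_spike = np.log(1.0 - pi) + logpdf(theta, v0)
    return 1.0 / (1.0 + np.exp(np.clip(log_spike - log_slab, -50, 50)))

# Toy usage: draw SGHMC samples from N(0, I) (i.e. U(theta) = ||theta||^2 / 2),
# then prune coordinates whose slab probability falls below 1/2.
rng = np.random.default_rng(0)
theta, r = np.zeros(10), rng.normal(size=10)
for _ in range(1000):
    theta, r = sghmc_step(theta, r, grad_U=lambda t: t, rng=rng)
mask = e_step_inclusion(theta) > 0.5
theta_sparse = theta * mask  # one-shot pruning of spike-assigned weights
```

In this toy setup the E-step and the pruning threshold stand in for the paper's MCMC-within-EM loop: in practice the inclusion probabilities would be updated alongside the SGHMC samples rather than computed once at the end.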
