Pigeonhole Stochastic Gradient Langevin Dynamics for Large Crossed Mixed Effects Models

Paper Title

Pigeonhole Stochastic Gradient Langevin Dynamics for Large Crossed Mixed Effects Models

Authors

Xinyu Zhang, Cheng Li

Abstract

Large crossed mixed effects models with imbalanced structures and missing data pose major computational challenges for standard Bayesian posterior sampling algorithms, as the computational complexity is usually superlinear in the number of observations. We propose two efficient subset-based stochastic gradient MCMC algorithms for such crossed mixed effects models, which facilitate scalable inference on both the variance components and regression coefficients. The first algorithm is developed for balanced design without missing observations, where we leverage the closed-form expression of the precision matrix for the full data matrix. The second algorithm, which we call the pigeonhole stochastic gradient Langevin dynamics (PSGLD), is developed for both balanced and unbalanced designs with potentially a large proportion of missing observations. Our PSGLD algorithm imputes the latent crossed random effects by running short Markov chains and then samples the model parameters of variance components and regression coefficients at each MCMC iteration. We provide theoretical guarantees by showing the convergence of the output distribution from the proposed algorithms to the target non-log-concave posterior distribution. A variety of numerical experiments based on both synthetic and real data demonstrate that the proposed algorithms can significantly reduce the computational cost of the standard MCMC algorithms and better balance the approximation accuracy and computational efficiency.
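For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of a generic stochastic gradient Langevin dynamics (SGLD) update on a toy problem. It is not the PSGLD algorithm from the paper: it has no crossed random effects and no imputation step. The model (a Gaussian mean with known unit variance and a Gaussian prior), the function name `sgld_gaussian_mean`, and all tuning values are assumptions chosen only for illustration.

```python
import numpy as np

def sgld_gaussian_mean(y, n_iter=5000, batch_size=50, step=1e-4,
                       prior_sd=10.0, seed=0):
    """Plain SGLD for theta in the toy model y_i ~ N(theta, 1),
    theta ~ N(0, prior_sd^2). Illustrative only; not the paper's PSGLD."""
    rng = np.random.default_rng(seed)
    n_obs = y.shape[0]
    theta = 0.0
    samples = np.empty(n_iter)
    for t in range(n_iter):
        # Draw a random subset of the data (the "subset-based" part).
        idx = rng.choice(n_obs, size=batch_size, replace=False)
        # Gradient of the log prior.
        grad_prior = -theta / prior_sd**2
        # Unbiased estimate of the full-data log-likelihood gradient,
        # rescaled from the subset to the full sample size.
        grad_lik = (n_obs / batch_size) * np.sum(y[idx] - theta)
        # Langevin step: half-step along the gradient plus Gaussian noise
        # whose variance equals the step size.
        theta = (theta + 0.5 * step * (grad_prior + grad_lik)
                 + np.sqrt(step) * rng.normal())
        samples[t] = theta
    return samples

# Synthetic data, assumed for the example only.
data_rng = np.random.default_rng(1)
y = data_rng.normal(loc=2.0, scale=1.0, size=1000)

samples = sgld_gaussian_mean(y)
# Exact posterior mean for this conjugate toy model, for comparison.
exact_mean = y.sum() / (len(y) + 1.0 / 10.0**2)
print(samples[1000:].mean(), exact_mean)
```

Each iteration touches only `batch_size` observations, which is the source of the computational savings that subset-based samplers aim for. With a fixed step size the chain only approximates the posterior, so the printed sample mean should be close to, but not identical to, the exact value.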
