Paper Title


Parallel MCMC Without Embarrassing Failures

Authors

Daniel Augusto de Souza, Diego Mesquita, Samuel Kaski, Luigi Acerbi

Abstract


Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach. First, MCMC is run in parallel on (sub)posteriors defined on data partitions. Then, a server combines local results. While efficient, this framework is very sensitive to the quality of subposterior sampling. Common sampling problems such as missing modes or misrepresentation of low-density regions are amplified -- instead of being corrected -- in the combination phase, leading to catastrophic failures. In this work, we propose a novel combination strategy to mitigate this issue. Our strategy, Parallel Active Inference (PAI), leverages Gaussian Process (GP) surrogate modeling and active learning. After fitting GPs to subposteriors, PAI (i) shares information between GP surrogates to cover missing modes; and (ii) uses active sampling to individually refine subposterior approximations. We validate PAI in challenging benchmarks, including heavy-tailed and multi-modal posteriors and a real-world application to computational neuroscience. Empirical results show that PAI succeeds where previous methods catastrophically fail, with a small communication overhead.
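The two-step framework described above can be illustrated with a minimal toy sketch. This is not the paper's PAI method (no GP surrogates or active sampling are shown); it only demonstrates the generic embarrassingly parallel pipeline, using a simple parametric (Gaussian) combination at the server for a conjugate 1D model where the exact answer is known. All variable names and the toy sampler are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: data ~ N(theta, 1) with a flat prior, so the full-data
# posterior over theta is N(mean(data), 1/n). We approximate it the
# embarrassingly parallel way: sample each subposterior, then combine.
data = rng.normal(loc=2.0, scale=1.0, size=600)
K = 3  # number of data partitions / parallel workers
partitions = np.array_split(data, K)

def metropolis(logpdf, n_samples=4000, step=0.1, x0=0.0):
    """Minimal random-walk Metropolis sampler (illustrative only)."""
    samples = np.empty(n_samples)
    x, lp = x0, logpdf(x0)
    for i in range(n_samples):
        prop = x + step * rng.normal()
        lp_prop = logpdf(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

# Step 1: run MCMC independently on each subposterior
# p_k(theta) ∝ p(data_k | theta) (the flat prior contributes nothing).
sub_samples = []
for part in partitions:
    log_sub = lambda th, d=part: -0.5 * np.sum((d - th) ** 2)
    sub_samples.append(metropolis(log_sub, x0=float(part.mean())))

# Step 2: a server combines the local results. Here: the classic
# parametric combination, where each subposterior is approximated by a
# Gaussian and their product is Gaussian with precision-weighted mean.
means = np.array([s.mean() for s in sub_samples])
precisions = np.array([1.0 / s.var() for s in sub_samples])
combined_mean = (precisions * means).sum() / precisions.sum()

full_mean = data.mean()  # exact full-data posterior mean
print(f"combined: {combined_mean:.3f}  exact: {full_mean:.3f}")
```

On this well-behaved unimodal target the parametric combination recovers the full posterior mean; the abstract's point is that with missing modes or misrepresented low-density regions in the subposterior samples, such combination steps amplify the error, which is what PAI's shared GP surrogates and active refinement are designed to prevent.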
