Paper Title

Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions

Authors

Huangjie Zheng, Mingyuan Zhou

Abstract

To measure the difference between two probability distributions, referred to as the source and target, respectively, we exploit both the chain rule and Bayes' theorem to construct conditional transport (CT), which is constituted by both a forward component and a backward one. The forward CT is the expected cost of moving a source data point to a target one, with their joint distribution defined by the product of the source probability density function (PDF) and a source-dependent conditional distribution, which is related to the target PDF via Bayes' theorem. The backward CT is defined by reversing the direction. The CT cost can be approximated by replacing the source and target PDFs with their discrete empirical distributions supported on mini-batches, making it amenable to implicit distributions and stochastic gradient descent-based optimization. When applied to train a generative model, CT is shown to strike a good balance between mode-covering and mode-seeking behaviors and strongly resist mode collapse. On a wide variety of benchmark datasets for generative modeling, substituting the default statistical distance of an existing generative adversarial network with CT is shown to consistently improve the performance. PyTorch code is provided.
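The mini-batch approximation described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the squared-Euclidean point-to-point cost and the temperature-scaled softmax standing in for the Bayes-derived, source-dependent conditional distribution are assumptions made here for clarity (the paper parameterizes the conditional with a learned component).

```python
import torch

def ct_cost(x, y, temperature=1.0):
    """Illustrative conditional-transport (CT) cost between two mini-batches.

    x: (n, d) source samples; y: (m, d) target samples.
    Forward CT: each source point spreads its mass over target points via a
    source-dependent conditional; backward CT reverses the direction.
    """
    # Pairwise transport costs; squared Euclidean is an assumed example cost.
    C = torch.cdist(x, y) ** 2                        # shape (n, m)

    # Assumed stand-in for the Bayes-derived conditional: a softmax over the
    # target mini-batch, favoring nearby points. Row i approximates p(y_j | x_i).
    pi_fwd = torch.softmax(-C / temperature, dim=1)
    forward = (pi_fwd * C).sum(dim=1).mean()          # E_x E_{y|x} c(x, y)

    # Backward CT: condition in the reverse direction, p(x_i | y_j) per column.
    pi_bwd = torch.softmax(-C / temperature, dim=0)
    backward = (pi_bwd * C).sum(dim=0).mean()         # E_y E_{x|y} c(x, y)

    return 0.5 * (forward + backward)
```

Because both empirical distributions are supported on mini-batches, the whole computation is differentiable, so the cost can be minimized with stochastic gradient descent even when the source samples come from an implicit generator.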
