Paper Title
Training Normalizing Flows from Dependent Data
Authors
Abstract
Normalizing flows are powerful non-parametric statistical models that function as a hybrid between density estimators and generative models. Current learning algorithms for normalizing flows assume that data points are sampled independently. This assumption is frequently violated in practice, which may lead to erroneous density estimation and data generation. We propose a likelihood objective for normalizing flows that incorporates dependencies between the data points, and we derive a flexible and efficient learning algorithm for it that suits different dependency structures. We show that respecting dependencies between observations can improve empirical results on both synthetic and real-world data, and leads to higher statistical power in a downstream application to genome-wide association studies.
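The abstract does not spell out the form of the proposed objective, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that dependence enters through the latent space: each latent dimension is jointly Gaussian across the n samples with a known n x n covariance `cov`, and the flow is a toy elementwise affine map. The function names (`affine_flow`, `iid_log_likelihood`, `dependent_log_likelihood`) and all parameters are hypothetical. With `cov` equal to the identity, the dependent objective reduces to the usual i.i.d. log-likelihood.

```python
import numpy as np


def affine_flow(x, log_scale, shift):
    """Toy elementwise affine flow z = x * exp(log_scale) + shift.

    Returns the latents z with shape (n, d) and the per-sample
    log|det Jacobian| with shape (n,).
    """
    z = x * np.exp(log_scale) + shift
    log_det = np.full(x.shape[0], np.sum(log_scale))
    return z, log_det


def iid_log_likelihood(x, log_scale, shift):
    """Standard flow log-likelihood: samples are treated as independent."""
    z, log_det = affine_flow(x, log_scale, shift)
    n, d = z.shape
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * n * d * np.log(2 * np.pi)
    return log_base + np.sum(log_det)


def dependent_log_likelihood(x, log_scale, shift, cov):
    """Flow log-likelihood with correlated samples (assumed model).

    Assumption: every column of the latent matrix Z (n x d) is drawn
    independently from N(0, cov), where cov is an n x n covariance
    across samples.
    """
    z, log_det = affine_flow(x, log_scale, shift)
    n, d = z.shape
    chol = np.linalg.cholesky(cov)            # cov = L @ L.T
    whitened = np.linalg.solve(chol, z)       # L^{-1} Z
    log_det_cov = 2.0 * np.sum(np.log(np.diag(chol)))
    log_base = (
        -0.5 * np.sum(whitened ** 2)          # -0.5 * tr(Z^T cov^{-1} Z)
        - 0.5 * d * log_det_cov
        - 0.5 * n * d * np.log(2 * np.pi)
    )
    return log_base + np.sum(log_det)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 2))
    log_scale = np.zeros(2)
    shift = np.zeros(2)
    # Hypothetical block structure: samples 0-1 and 2-3 are correlated pairs.
    cov = np.eye(4)
    cov[0, 1] = cov[1, 0] = 0.5
    cov[2, 3] = cov[3, 2] = 0.5
    print(iid_log_likelihood(x, log_scale, shift))
    print(dependent_log_likelihood(x, log_scale, shift, cov))
```

The Cholesky factorization costs O(n^3) for a dense covariance, so a practical method would need to exploit structure in the dependencies (for example block-diagonal or low-rank forms); which structures the paper actually supports is not stated in the abstract.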