Paper Title
Evaluation of Categorical Generative Models -- Bridging the Gap Between Real and Synthetic Data
Paper Authors
Paper Abstract
The machine learning community has mainly relied on real data to benchmark algorithms because it provides compelling evidence of model applicability. Evaluation on synthetic datasets can be a powerful tool to provide a better understanding of a model's strengths, weaknesses, and overall capabilities. Gaining these insights can be particularly important for generative modeling, as the target quantity is completely unknown. Multiple issues related to the evaluation of generative models have been reported in the literature. We argue that these problems can be avoided by an evaluation based on ground truth. A general criticism of synthetic experiments is that they are too simplified and not representative of practical scenarios. As such, our experimental setting is tailored to a realistic generative task. We focus on categorical data and introduce an appropriately scalable evaluation method. Our method involves tasking a generative model to learn a distribution in a high-dimensional setting. We then successively bin the large space to obtain smaller probability spaces where meaningful statistical tests can be applied. We consider increasingly large probability spaces, which correspond to increasingly difficult modeling tasks, and compare the generative models based on the highest task difficulty they can reach before being detected as being too far from the ground truth. We validate our evaluation procedure with synthetic experiments on both synthetic generative models and current state-of-the-art categorical generative models.
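The binning procedure described in the abstract can be illustrated with a toy sketch. The abstract does not name the statistical test or the binning scheme, so everything below is an assumption made for illustration: states are integers, a state is assigned to a bin by `state % k`, the test is a chi-square goodness-of-fit test, and the critical value uses the Wilson-Hilferty approximation so that only the standard library is needed. The function names (`chi_square_stat`, `approx_critical_value`, `highest_level_passed`) are hypothetical and do not come from the paper.

```python
from statistics import NormalDist


def chi_square_stat(counts, probs):
    """Pearson chi-square statistic of observed counts against expected
    bin probabilities. Assumes every probability is strictly positive."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def approx_critical_value(df, alpha=0.05):
    """Approximate upper-tail chi-square critical value using the
    Wilson-Hilferty cube-root approximation (an assumption made to avoid
    third-party dependencies; an exact quantile could be used instead)."""
    z = NormalDist().inv_cdf(1 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1 - a + z * a ** 0.5) ** 3


def highest_level_passed(samples, truth_probs, levels, alpha=0.05):
    """Return the largest number of bins k in `levels` (ordered from
    coarse to fine) at which the model samples are not rejected, or 0
    if they are rejected at the coarsest level.

    samples     : list of integer states drawn from the model
    truth_probs : ground-truth probability of each state (index = state)
    """
    passed = 0
    for k in levels:
        # Ground-truth probability of each bin (illustrative binning: state % k).
        bin_probs = [0.0] * k
        for state, p in enumerate(truth_probs):
            bin_probs[state % k] += p

        # Observed bin counts from the model samples.
        counts = [0] * k
        for s in samples:
            counts[s % k] += 1

        # Stop at the first level where the model is detected as too far
        # from the ground truth.
        if chi_square_stat(counts, bin_probs) > approx_critical_value(k - 1, alpha):
            break
        passed = k
    return passed


if __name__ == "__main__":
    truth = [1 / 64] * 64                 # toy ground truth: uniform over 64 states
    levels = [2, 4, 8, 16, 32, 64]        # increasingly fine binnings

    matching = list(range(64)) * 10       # samples that match the truth exactly
    collapsed = [0] * 1000                # a model that always emits state 0

    print(highest_level_passed(matching, truth, levels))
    print(highest_level_passed(collapsed, truth, levels))
```

In this sketch, a model is scored by the finest binning it survives, which mirrors the comparison "based on the highest task difficulty" described above. A realistic setting would use a far larger state space, where bin probabilities could not be obtained by enumerating every state as done here.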