Paper Title
Out-of-Distribution Detection using BiGAN and MDL
Paper Authors
Paper Abstract
We consider the following problem: a large dataset of normal data is available. We are then given a new, possibly quite small, set of data, and we must decide whether these are normal data or whether they indicate a new phenomenon. This is a novelty detection, or out-of-distribution detection, problem. An example is in medicine, where the normal data come from people with no known disease and the new dataset comes from people with symptoms. Other examples arise in security. We solve this problem by training a bidirectional generative adversarial network (BiGAN) on the normal data and using a Gaussian graphical model to model the output. We then apply universal source coding, or minimum description length (MDL), to the output to decide whether it comes from a new distribution, in an implementation of Kolmogorov and Martin-Löf randomness. We apply the methodology to both MNIST data and a real-world electrocardiogram (ECG) dataset of healthy subjects and patients with Kawasaki disease, and show better performance in terms of the ROC curve than similar methods.
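The MDL decision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the BiGAN encoder is replaced by simulated latent vectors, normal data are assumed to map to a standard normal latent distribution, and the Gaussian graphical model is simplified to a Gaussian with diagonal covariance. The function names `gaussian_codelength_bits` and `mdl_ood_score` are hypothetical. The idea is to compare the code length of a batch under the fixed reference model with its code length under a two-part universal code (fitted parameters plus a parameter cost); a batch that the universal code compresses better is flagged as out-of-distribution.

```python
import numpy as np


def gaussian_codelength_bits(z, mean, var):
    """Ideal code length (bits) of the rows of z under an independent Gaussian.

    For continuous data this is defined up to a quantization constant,
    which cancels when two code lengths are subtracted.
    """
    nll_nats = 0.5 * np.sum(np.log(2 * np.pi * var) + (z - mean) ** 2 / var)
    return nll_nats / np.log(2)


def mdl_ood_score(z):
    """Reference code length minus universal code length, in bits.

    Assumption: z holds latent codes that are N(0, I) for normal data.
    A positive score means the batch is shorter to describe with its
    own fitted model, which suggests a new distribution.
    """
    n, d = z.shape

    # Code length under the fixed reference model N(0, I).
    ref_bits = gaussian_codelength_bits(z, np.zeros(d), np.ones(d))

    # Two-part universal code: fit a diagonal Gaussian to the batch and
    # pay (k / 2) * log2(n) bits for its k = 2 * d parameters.
    mean_hat = z.mean(axis=0)
    var_hat = z.var(axis=0) + 1e-9  # small floor avoids log(0)
    k = 2 * d
    univ_bits = gaussian_codelength_bits(z, mean_hat, var_hat) + 0.5 * k * np.log2(n)

    return ref_bits - univ_bits


# Simulated latent codes standing in for BiGAN encoder outputs (assumption).
rng = np.random.default_rng(0)
normal_batch = rng.standard_normal((50, 8))
shifted_batch = rng.standard_normal((50, 8)) + 1.5

score_normal = mdl_ood_score(normal_batch)
score_shifted = mdl_ood_score(shifted_batch)
print(f"normal batch score:  {score_normal:.1f} bits")
print(f"shifted batch score: {score_shifted:.1f} bits")
```

In this sketch a threshold of zero bits separates the two simulated batches; in practice the threshold would be swept to trace out an ROC curve such as the one the abstract refers to.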