Paper title

Adaptive novelty detection with false discovery rate guarantee

Authors

Ariane Marandon, Lihua Lei, David Mary, Etienne Roquain

Abstract

This paper studies the semi-supervised novelty detection problem where a set of "typical" measurements is available to the researcher. Motivated by recent advances in multiple testing and conformal inference, we propose AdaDetect, a flexible method that is able to wrap around any probabilistic classification algorithm and control the false discovery rate (FDR) on detected novelties in finite samples without any distributional assumption other than exchangeability. In contrast to classical FDR-controlling procedures that are often committed to a pre-specified p-value function, AdaDetect learns the transformation in a data-adaptive manner to focus the power on the directions that distinguish between inliers and outliers. Inspired by the multiple testing literature, we further propose variants of AdaDetect that are adaptive to the proportion of nulls while maintaining the finite-sample FDR control. The methods are illustrated on synthetic datasets and real-world datasets, including an application in astrophysics.
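
The abstract does not spell out the procedure, so the following is only a minimal sketch of the generic building blocks it names: conformal (empirical) p-values computed from a calibration set of "typical" points, followed by the Benjamini-Hochberg step-up rule. It assumes novelty scores are already available (higher means more novel) and leaves out the data-adaptive score learning and the null-proportion-adaptive variants that AdaDetect adds. The function names `conformal_pvalues`, `benjamini_hochberg`, and `detect_novelties` are illustrative, not taken from the paper.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    """Empirical p-values; a higher score is assumed to mean 'more novel'."""
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    # Count calibration scores that are >= each test score.
    n_ge = n - np.searchsorted(cal, test, side="left")
    return (1 + n_ge) / (n + 1)


def benjamini_hochberg(pvalues, alpha):
    """Boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest index passing its threshold
        reject[order[: k + 1]] = True
    return reject


def detect_novelties(cal_scores, test_scores, alpha=0.1):
    """Flag test points as novelties at nominal FDR level alpha."""
    return benjamini_hochberg(conformal_pvalues(cal_scores, test_scores), alpha)
```

For example, with scores from some fitted classifier, `detect_novelties(cal_scores, test_scores, alpha=0.1)` returns a boolean mask over the test points. The smallest attainable p-value is 1/(n+1), so the calibration set needs to be large enough relative to the number of test points and alpha for any rejection to be possible.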
