Paper title

Fitting the BumpHunter test statistic distribution and global p-value estimation

Authors

Louis Vaslin, Samuel Calvet, Vincent Barra, Julien Donini

Abstract

In high energy physics, it is common to search for a localized deviation in data with respect to a given reference. For this task, the well-known BumpHunter algorithm allows a model-independent deviation search, with the advantage of estimating a global p-value that accounts for the look-elsewhere effect. However, this method relies on generating and scanning thousands of pseudo-data histograms sampled from the reference background. Accurately calculating a global significance of $5\sigma$ therefore requires substantial computing resources. To speed up this process and improve the algorithm, we propose in this paper a way to estimate the global p-value using a more reasonable number of pseudo-data histograms. The method fits the test statistic distribution with a functional form inspired by similar statistical problems. We find that this alternative method allows the global significance to be evaluated with a precision of about 5% up to the $5\sigma$ discovery threshold.
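The brute-force procedure that the abstract describes as costly can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation and not the API of any BumpHunter package: the function names, the window widths, the choice of $-\ln(\min p_{\text{local}})$ as test statistic, the convention of ignoring deficits, and the simple Poisson sampler are all assumptions made for illustration. The fit of the test statistic distribution proposed in the paper is not reproduced here, since the abstract does not give its functional form.

```python
import math
import random
from statistics import NormalDist


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected); deficits get p = 1 (assumed convention)."""
    if observed <= expected:
        return 1.0
    # First term of the upper tail, computed in log space for stability.
    term = math.exp(-expected + observed * math.log(expected) - math.lgamma(observed + 1))
    total, k = 0.0, observed
    while term > total * 1e-16:  # terms shrink monotonically once k > expected
        total += term
        k += 1
        term *= expected / k
    return max(total, 1e-300)  # floor avoids log(0) on underflow


def bump_statistic(data, background, min_width=1, max_width=4):
    """Scan all windows and return -ln of the smallest local p-value."""
    best = 1.0
    for width in range(min_width, max_width + 1):
        for start in range(len(data) - width + 1):
            d = sum(data[start:start + width])
            b = sum(background[start:start + width])
            best = min(best, poisson_tail(d, b))
    return -math.log(best)


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def global_p_value(data, background, n_pseudo=1000, seed=0):
    """Fraction of background-only pseudo-experiments at least as extreme as the data."""
    rng = random.Random(seed)
    t_obs = bump_statistic(data, background)
    exceed = 0
    for _ in range(n_pseudo):
        pseudo = [sample_poisson(rng, b) for b in background]
        if bump_statistic(pseudo, background) >= t_obs:
            exceed += 1
    return t_obs, exceed / n_pseudo


def significance(p_value):
    """One-sided Gaussian significance (in sigma) for a given p-value."""
    if p_value <= 0.0:
        return math.inf
    if p_value >= 1.0:
        return -math.inf
    return NormalDist().inv_cdf(1.0 - p_value)


if __name__ == "__main__":
    background = [10.0] * 10          # hypothetical flat reference
    data = [10] * 10
    data[4] = 25                      # hypothetical injected excess
    t_obs, p_glob = global_p_value(data, background, n_pseudo=1000)
    print(t_obs, p_glob, significance(p_glob))
```

With $N$ pseudo-experiments, the smallest non-zero global p-value this counting estimate can resolve is $1/N$. A one-sided $5\sigma$ significance corresponds to $p \approx 2.9 \times 10^{-7}$, so resolving it by counting alone would take millions of pseudo-data histograms, which is the cost that motivates estimating the tail from a fitted distribution instead.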
