Paper Title
Statistical Comparisons of Classifiers by Generalized Stochastic Dominance
Paper Authors
Paper Abstract
Although it is a crucial question for the development of machine learning algorithms, there is still no consensus on how to compare classifiers over multiple data sets with respect to several criteria. Every comparison framework is confronted with (at least) three fundamental challenges: the multiplicity of quality criteria, the multiplicity of data sets, and the randomness of the selection of data sets. In this paper, we add a fresh view to the vivid debate by adopting recent developments in decision theory. Based on so-called preference systems, our framework ranks classifiers by a generalized concept of stochastic dominance, which powerfully circumvents the cumbersome, and often even self-contradictory, reliance on aggregates. Moreover, we show that generalized stochastic dominance can be operationalized by solving easy-to-handle linear programs and statistically tested by an adapted two-sample observation-randomization test. This yields a powerful framework for the statistical comparison of classifiers over multiple data sets with respect to multiple quality criteria simultaneously. We illustrate and investigate our framework in a simulation study and with a set of standard benchmark data sets.
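The following is a minimal sketch, not the paper's exact procedure, meant only to convey the flavour of an observation-randomization (permutation) test for comparing two classifiers across benchmark data sets. The test statistic used here, the mean difference of a single quality criterion such as accuracy, is a deliberate simplification: the paper's statistic is instead based on generalized stochastic dominance over a preference system of several criteria and is evaluated via linear programming. All data values and names (`randomization_test`, `acc_a`, `acc_b`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_test(scores_a, scores_b, n_permutations=10_000, rng=rng):
    """Paired sign-flip randomization test on per-data-set scores.

    scores_a, scores_b: one quality-criterion value (e.g. accuracy) per
    benchmark data set, for classifiers A and B respectively.
    Returns the observed mean difference and a two-sided p-value.
    """
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = diffs.mean()
    count = 0
    for _ in range(n_permutations):
        # Under the null hypothesis that A and B are exchangeable on each
        # data set, the sign of every paired difference may be flipped.
        signs = rng.choice([-1.0, 1.0], size=diffs.size)
        if abs((signs * diffs).mean()) >= abs(observed):
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical accuracies of two classifiers on ten benchmark data sets.
acc_a = np.array([0.91, 0.84, 0.78, 0.88, 0.95, 0.81, 0.76, 0.90, 0.83, 0.87])
acc_b = np.array([0.89, 0.82, 0.79, 0.85, 0.93, 0.80, 0.74, 0.88, 0.81, 0.86])

diff, p = randomization_test(acc_a, acc_b)
print(f"mean accuracy difference: {diff:.3f}, randomization p-value: {p:.3f}")
```

In the paper's setting, the scalar accuracy difference above would be replaced by a dominance-based statistic computed from all quality criteria jointly, so the sketch should be read as an illustration of the randomization principle only.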