Paper Title
Attributing AUC-ROC to Analyze Binary Classifier Performance
Paper Authors
Paper Abstract
Area Under the Receiver Operating Characteristic Curve (AUC-ROC) is a popular evaluation metric for binary classifiers. In this paper, we discuss techniques to segment the AUC-ROC along human-interpretable dimensions. AUC-ROC is not an additive/linear function over the data samples; therefore, segmenting the overall AUC-ROC is different from tabulating the AUC-ROC of data segments. To segment the overall AUC-ROC, we must first solve an \emph{attribution} problem to identify credit for individual examples. We observe that AUC-ROC, though non-linear over examples, is linear over \emph{pairs} of examples. This observation leads to a simple, efficient attribution technique for examples (example attributions) and for pairs of examples (pair attributions). We automatically slice these attributions using decision trees by making the tree predict the attributions; we use the notion of honest estimates along with a t-test to mitigate false discovery. Our experiments with the method show that an inferior model can outperform a superior model (trained to optimize a different training objective) on the inferior model's own training objective, a manifestation of Goodhart's Law. In contrast, AUC attributions enable a reasonable comparison. Example attributions can be used to slice this comparison. Pair attributions are used to categorize pairs of items (one positively labeled and one negatively labeled) that the model has trouble separating. These categories identify the decision boundary of the classifier and the headroom to improve AUC.
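The linearity over pairs mentioned in the abstract can be illustrated with a small sketch. It uses the standard pairwise form of AUC-ROC (the fraction of positive/negative pairs ranked correctly, with ties counted as one half). The function names and the rule of splitting each pair's credit equally between its two examples are assumptions made for illustration; the abstract does not spell out the exact attribution rule used in the paper.

```python
def pair_attributions(scores, labels):
    """Map each (positive index, negative index) pair to its share of AUC-ROC.

    A pair earns 1 if the positive example is scored higher, 0.5 on a tie,
    and 0 otherwise; dividing by the number of pairs makes the shares sum
    to the overall AUC-ROC.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_pairs = len(pos) * len(neg)
    credits = {}
    for i in pos:
        for j in neg:
            if scores[i] > scores[j]:
                credit = 1.0
            elif scores[i] == scores[j]:
                credit = 0.5
            else:
                credit = 0.0
            credits[(i, j)] = credit / n_pairs
    return credits


def example_attributions(scores, labels):
    """Per-example credit. Assumption: each pair's credit is split equally."""
    attr = [0.0] * len(scores)
    for (i, j), credit in pair_attributions(scores, labels).items():
        attr[i] += credit / 2
        attr[j] += credit / 2
    return attr


def auc(scores, labels):
    """AUC-ROC as the sum of all pair attributions."""
    return sum(pair_attributions(scores, labels).values())


# Toy data (hypothetical): four examples, two positive and two negative.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

overall = auc(scores, labels)                 # 3 of 4 pairs ranked correctly
per_example = example_attributions(scores, labels)
print(overall, per_example, sum(per_example))
```

Because the per-example attributions are additive, summing them over any segment of the data gives that segment's share of the overall AUC-ROC, which is not the same quantity as the AUC-ROC computed on the segment alone.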