SCOUT: Self-aware Discriminant Counterfactual Explanations

Paper Title

SCOUT: Self-aware Discriminant Counterfactual Explanations

Authors

Pei Wang, Nuno Vasconcelos

Abstract

The problem of counterfactual visual explanations is considered. A new family of discriminant explanations is introduced. These produce heatmaps that attribute high scores to image regions informative of a classifier prediction but not of a counter class. They connect attributive explanations, which are based on a single heatmap, to counterfactual explanations, which account for both the predicted class and the counter class. The latter are shown to be computable by the combination of two discriminant explanations with reversed class pairs. It is argued that self-awareness, namely the ability to produce classification confidence scores, is important for the computation of discriminant explanations, which seek to identify regions where it is easy to discriminate between the prediction and the counter class. This suggests the computation of discriminant explanations by the combination of three attribution maps. The resulting counterfactual explanations are optimization free and thus much faster than previous methods. To address the difficulty of their evaluation, a proxy task and a set of quantitative metrics are also proposed. Experiments under this protocol show that, for popular networks, the proposed counterfactual explanations outperform the state of the art while achieving much higher speeds. In a human-learning machine teaching experiment, they are also shown to improve mean student accuracy from chance level to 95%.
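The abstract states only that a discriminant explanation combines three attribution maps and that a counterfactual explanation pairs two discriminant explanations with reversed class pairs. The sketch below illustrates one plausible reading under loudly stated assumptions: the three maps are taken to be an attribution for the predicted class, the complement of an attribution for the counter class, and an attribution for the confidence score, and they are combined by an elementwise product. The function names, the product rule, the complement definition, and the toy arrays are all hypothetical and are not taken from the abstract.

```python
import numpy as np

def complement(att):
    """Complement of an attribution map: high where the original is low.
    Assumption: defined here as (max value - map)."""
    return att.max() - att

def discriminant_map(att_pred, att_counter, att_conf):
    """Combine three attribution maps into one discriminant heatmap.
    Assumption: elementwise product, then scaling to a peak of 1."""
    d = att_pred * complement(att_counter) * att_conf
    peak = d.max()
    return d / peak if peak > 0 else d

# Toy 2x2 attribution maps for a query image (made-up values).
att_pred = np.array([[0.9, 0.8], [0.1, 0.2]])     # evidence for predicted class
att_counter = np.array([[0.1, 0.9], [0.2, 0.1]])  # evidence for counter class
att_conf = np.array([[1.0, 1.0], [0.5, 0.5]])     # evidence for confidence score

# Region informative of the prediction but not of the counter class.
d_query = discriminant_map(att_pred, att_counter, att_conf)

# Counterfactual pairing: the same computation with the class roles reversed.
# For brevity the toy maps are reused; in practice the reversed explanation
# would use attributions computed on an image of the counter class.
d_reversed = discriminant_map(att_counter, att_pred, att_conf)
```

In this toy case the top-left cell dominates `d_query` because it scores high for the predicted class and low for the counter class, while the top-right cell is suppressed because both classes attend to it. No optimization loop is involved, which matches the abstract's claim that the explanations are optimization free.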
