Paper Title
Explainable Deep Classification Models for Domain Generalization
Paper Authors
Paper Abstract
Conventionally, AI models are thought to trade off explainability for lower accuracy. We develop a training strategy that not only leads to a more explainable AI system for object classification, but also, as a consequence, suffers no perceptible accuracy degradation. Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision. This is represented in the form of a saliency map conveying how much each pixel contributed to the network's decision. Our training strategy enforces periodic saliency-based feedback that encourages the model to focus on the image regions directly corresponding to the ground-truth object. We quantify explainability using an automated metric and human judgment. We propose explainability as a means of bridging the visual-semantic gap between different domains, where model explanations are used to disentangle domain-specific information from otherwise relevant features. We demonstrate that this leads to improved generalization to new domains without hindering performance on the original domain.
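As a rough illustration of the mechanism the abstract describes, the sketch below shows how a periodic saliency-based feedback term could be combined with a standard classification loss. It is a minimal sketch in plain PyTorch, assuming a gradient-based saliency map and binary ground-truth object masks; the names gradient_saliency, saliency_feedback_loss, feedback_every, and lam are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def gradient_saliency(model, images, labels):
    # Per-pixel saliency: absolute gradient of the true-class score w.r.t. the input.
    images = images.clone().requires_grad_(True)
    logits = model(images)
    score = logits.gather(1, labels.view(-1, 1)).sum()
    grad, = torch.autograd.grad(score, images, create_graph=True)
    sal = grad.abs().amax(dim=1)                          # collapse channels -> (B, H, W)
    sal = sal / (sal.amax(dim=(1, 2), keepdim=True) + 1e-8)  # normalize each map to [0, 1]
    return sal

def saliency_feedback_loss(saliency, object_masks):
    # Penalize saliency mass that falls outside the ground-truth object region.
    outside = saliency * (1.0 - object_masks)             # object_masks: (B, H, W) in {0, 1}
    return outside.sum(dim=(1, 2)).mean()

def training_step(model, optimizer, images, labels, object_masks,
                  step, feedback_every=10, lam=0.1):
    # Hypothetical training step: cross-entropy plus periodic saliency feedback.
    logits = model(images)
    loss = F.cross_entropy(logits, labels)
    if step % feedback_every == 0:
        sal = gradient_saliency(model, images, labels)
        loss = loss + lam * saliency_feedback_loss(sal, object_masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because create_graph=True is used when computing the saliency map, the feedback term backpropagates into the network weights, nudging the model's evidence toward the annotated object region while the usual classification objective is left unchanged on the other steps.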