Paper Title

Core Risk Minimization using Salient ImageNet

Authors

Sahil Singla, Mazda Moayeri, Soheil Feizi

Abstract

Deep neural networks can be unreliable in the real world, especially when they heavily use spurious features for their predictions. Recently, Singla & Feizi (2022) introduced the Salient ImageNet dataset by annotating and localizing core and spurious features of ~52k samples from 232 classes of ImageNet. While this dataset is useful for evaluating the reliance of pretrained models on spurious features, its small size limits its usefulness for training models. In this work, we first introduce the Salient ImageNet-1M dataset with more than 1 million soft masks localizing core and spurious features for all 1000 ImageNet classes. Using this dataset, we evaluate the reliance of several ImageNet pretrained models (42 total) on spurious features and observe that: (i) transformers are more sensitive to spurious features compared to ConvNets, and (ii) zero-shot CLIP transformers are highly susceptible to spurious features. Next, we introduce a new learning paradigm called Core Risk Minimization (CoRM) whose objective ensures that the model predicts a class using its core features. We evaluate different computational approaches for solving CoRM and achieve significantly higher (+12%) core accuracy (accuracy when non-core regions are corrupted with noise), with no drop in clean accuracy compared to models trained via Empirical Risk Minimization.
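The "core accuracy" metric above evaluates a model on images whose non-core regions have been corrupted with noise, using the dataset's soft masks. The following is a minimal sketch of one plausible way to apply such a corruption; the function name `corrupt_non_core`, the noise level, and the blending scheme are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def corrupt_non_core(image, core_mask, noise_std=0.25, rng=None):
    """Add Gaussian noise outside the (soft) core region of an image.

    image:     float array in [0, 1], shape (H, W, C)
    core_mask: float array in [0, 1], shape (H, W); 1.0 marks core pixels
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, noise_std, size=image.shape)
    m = core_mask[..., None]                # broadcast mask over channels
    corrupted = image + (1.0 - m) * noise   # noise scaled by non-core weight
    return np.clip(corrupted, 0.0, 1.0)    # keep valid pixel range
```

Core accuracy would then be ordinary top-1 accuracy computed on images transformed this way: pixels with mask value 1.0 pass through unchanged, while fully non-core pixels receive the full noise.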
