Paper Title
ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training
Paper Authors
Paper Abstract
Negative flips are errors introduced in a classification system when a legacy model is updated. Existing methods to reduce the negative flip rate (NFR) either do so at the expense of overall accuracy by forcing a new model to imitate the old model, or use ensembles, which multiply inference cost prohibitively. We analyze the role of ensembles in reducing NFR and observe that they remove negative flips that are typically not close to the decision boundary, but often exhibit large deviations in the distance among their logits. Based on this observation, we present a method, called Ensemble Logit Difference Inhibition (ELODI), to train a classification system that achieves paragon performance in both error rate and NFR, at the inference cost of a single model. The method distills a homogeneous ensemble to a single student model which is used to update the classification system. ELODI also introduces a generalized distillation objective, Logit Difference Inhibition (LDI), which only penalizes the logit difference of a subset of classes with the highest logit values. On multiple image classification benchmarks, model updates with ELODI demonstrate superior accuracy retention and NFR reduction.
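To make the LDI objective described in the abstract more concrete, below is a minimal PyTorch sketch. It is an illustration under stated assumptions, not the paper's exact formulation: the function name `ldi_loss`, the choice of a squared penalty, and the rule of selecting the top-k classes by ensemble logit value are all assumptions introduced here for clarity.

```python
import torch

def ldi_loss(student_logits, ensemble_logits, k=5):
    """Hypothetical sketch of Logit Difference Inhibition (LDI).

    Penalizes the difference between student and ensemble logits, but
    only on the k classes with the highest ensemble logit values; the
    exact loss form and class-selection rule are assumptions.
    """
    # Indices of the k classes with the highest ensemble logits (per sample).
    _, topk = ensemble_logits.topk(k, dim=-1)
    # Gather student and ensemble logits on that subset of classes.
    s = student_logits.gather(-1, topk)
    t = ensemble_logits.gather(-1, topk)
    # Inhibit the logit difference on the subset (squared penalty assumed).
    return ((s - t) ** 2).mean()

# Usage sketch: the teacher target is the averaged logits of a
# homogeneous ensemble (same architecture, different random seeds),
# detached so gradients flow only into the student.
#   teacher_logits = torch.stack([m(x) for m in ensemble]).mean(dim=0)
#   loss = ldi_loss(student(x), teacher_logits.detach(), k=5)
```

Restricting the penalty to the highest-scoring classes, rather than matching the full logit vector as in standard distillation, is what the abstract refers to as a generalized distillation objective; the k=5 default above is an arbitrary placeholder.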