A Light Recipe to Train Robust Vision Transformers

Paper Title

A Light Recipe to Train Robust Vision Transformers

Authors

Edoardo Debenedetti, Vikash Sehwag, Prateek Mittal

Abstract

In this paper, we ask whether Vision Transformers (ViTs) can serve as an underlying architecture for improving the adversarial robustness of machine learning models against evasion attacks. While earlier works have focused on improving Convolutional Neural Networks, we show that ViTs are also highly suitable for adversarial training and achieve competitive performance. We achieve this using a custom adversarial training recipe, discovered through rigorous ablation studies on a subset of the ImageNet dataset. The canonical training recipe for ViTs recommends strong data augmentation, in part to compensate for the lack of vision inductive bias in attention modules compared to convolutions. We show that this recipe achieves suboptimal performance when used for adversarial training. In contrast, we find that omitting all heavy data augmentation and adding a small bag of tricks ($\varepsilon$-warmup and larger weight decay) significantly boosts the performance of robust ViTs. We show that our recipe generalizes to different classes of ViT architectures and to large-scale models on full ImageNet-1k. Additionally, investigating the reasons for the robustness of our models, we show that it is easier to generate strong attacks during training when using our recipe, and that this leads to better robustness at test time. Finally, we further study one consequence of adversarial training by proposing a way to quantify the semantic nature of adversarial perturbations, and we highlight its correlation with the robustness of the model. Overall, we recommend that the community avoid translating the canonical training recipes for ViTs to robust training, and rethink common training choices in the context of adversarial training.
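
The abstract names $\varepsilon$-warmup as one ingredient of the recipe but does not say how the perturbation budget is ramped. The following is a minimal sketch of one plausible reading, assuming a linear ramp from 0 to the final budget over a fixed number of steps. The function name `epsilon_warmup`, the parameter `warmup_steps`, and the budget `4 / 255` are illustrative assumptions, not values taken from the paper.

```python
def epsilon_warmup(step: int, warmup_steps: int, eps_max: float) -> float:
    """Return the perturbation budget to use at a given training step.

    Assumption: the budget grows linearly from 0 to eps_max during the
    first `warmup_steps` steps and is held at eps_max afterwards.
    """
    if warmup_steps <= 0:
        # No warmup requested: use the full budget from the start.
        return eps_max
    # Fraction of the warmup completed, clamped to [0, 1].
    fraction = min(max(step, 0) / warmup_steps, 1.0)
    return eps_max * fraction


# Hypothetical final budget, chosen only for illustration.
EPS_MAX = 4 / 255

# Budget for the first six steps with a four-step warmup.
schedule = [epsilon_warmup(s, warmup_steps=4, eps_max=EPS_MAX) for s in range(6)]
```

In an adversarial training loop, the value returned for the current step would bound the attack used to craft training examples, so early updates see weaker perturbations than later ones.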
