Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis

Paper title

Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis

Authors

Jiancong Xiao, Zeyu Qin, Yanbo Fan, Baoyuan Wu, Jue Wang, Zhi-Quan Luo

Abstract

Adversarial Training (AT) has been demonstrated to be one of the most effective defenses against adversarial examples. While most existing works focus on AT with a single type of perturbation (e.g., $\ell_\infty$ attacks), DNNs face threats from different types of adversarial examples. Adversarial training for multiple perturbations (ATMP) was therefore proposed to generalize adversarial robustness across perturbation types ($\ell_1$, $\ell_2$, and $\ell_\infty$ norm-bounded perturbations). However, the resulting models exhibit a trade-off between different attacks. Moreover, there is no theoretical analysis of ATMP, which limits its further development. In this paper, we first provide a smoothness analysis of ATMP and show that $\ell_1$, $\ell_2$, and $\ell_\infty$ adversaries contribute differently to the smoothness of the ATMP loss function. Based on this, we develop stability-based excess risk bounds and propose adaptive smoothness-weighted adversarial training for multiple perturbations. Theoretically, our algorithm yields better bounds. Empirically, our experiments on CIFAR10 and CIFAR100 achieve state-of-the-art performance against a mixture of multiple-perturbation attacks.
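
For orientation, the ATMP objective mentioned in the abstract can be written in one common form. This is a sketch under stated assumptions, not the paper's exact formulation: $f_\theta$ denotes the network, $\ell$ the training loss, $\epsilon_p$ the radius of the $\ell_p$ threat model, and the weights $w_p$ are hypothetical placeholders introduced here for illustration. Uniform weights $w_p = 1/3$ recover the usual average-over-perturbations objective; the abstract does not state how the adaptive smoothness-based weights are computed.

$$
\min_\theta \; \mathbb{E}_{(x,y)} \sum_{p \in \{1,2,\infty\}} w_p \max_{\|\delta\|_p \le \epsilon_p} \ell\big(f_\theta(x+\delta),\, y\big), \qquad w_p \ge 0, \quad \sum_{p} w_p = 1 .
$$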
