Paper Title

Towards Understanding the Dynamics of the First-Order Adversaries

Paper Authors

Deng, Zhun, He, Hangfeng, Huang, Jiaoyang, Su, Weijie J.

Paper Abstract

An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations to the inputs. To improve the robustness of these models, one of the most popular defense mechanisms is to alternatively maximize the loss over the constrained perturbations (or called adversaries) on the inputs using projected gradient ascent and minimize over weights. In this paper, we analyze the dynamics of the maximization step towards understanding the experimentally observed effectiveness of this defense mechanism. Specifically, we investigate the non-concave landscape of the adversaries for a two-layer neural network with a quadratic loss. Our main result proves that projected gradient ascent finds a local maximum of this non-concave problem in a polynomial number of iterations with high probability. To our knowledge, this is the first work that provides a convergence analysis of the first-order adversaries. Moreover, our analysis demonstrates that, in the initial phase of adversarial training, the scale of the inputs matters in the sense that a smaller input scale leads to faster convergence of adversarial training and a "more regular" landscape. Finally, we show that these theoretical findings are in excellent agreement with a series of experiments.
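The maximization step the abstract analyzes can be sketched concretely. Below is a minimal numpy illustration of projected gradient ascent on the perturbation for a two-layer ReLU network with a quadratic loss; the network sizes, step size, and l_inf radius are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU network f(x) = a^T relu(W x); all sizes and
# hyperparameters below are illustrative assumptions.
d, m = 10, 32                        # input dimension, hidden width
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.normal(size=m) / np.sqrt(m)
x = rng.normal(size=d)               # a single input
y = 0.0                              # its target
eps, eta, steps = 0.1, 0.01, 50      # l_inf radius, step size, iterations

def loss_and_grad(delta):
    """Quadratic loss (f(x + delta) - y)^2 and its gradient w.r.t. delta."""
    z = W @ (x + delta)              # pre-activations
    h = np.maximum(z, 0.0)           # ReLU
    f = a @ h
    # chain rule: d loss / d delta = 2 (f - y) * W^T (a * 1[z > 0])
    g = 2.0 * (f - y) * (W.T @ (a * (z > 0)))
    return (f - y) ** 2, g

# Inner maximization: projected gradient *ascent* on the perturbation delta,
# keeping delta inside the l_inf ball of radius eps via clipping.
delta = np.zeros(d)
loss0, _ = loss_and_grad(delta)
for _ in range(steps):
    _, g = loss_and_grad(delta)
    delta = np.clip(delta + eta * g, -eps, eps)  # ascent step + projection
final_loss, _ = loss_and_grad(delta)
```

In full adversarial training this inner ascent alternates with an outer gradient-descent step on the weights `W` and `a`; the abstract's convergence result concerns the ascent loop alone.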
