Paper Title

Robustness from Simple Classifiers

Paper Authors

Sharon Qian, Dimitris Kalimeris, Gal Kaplun, Yaron Singer

Paper Abstract

Despite the vast success of Deep Neural Networks in numerous application domains, it has been shown that such models are not robust, i.e., they are vulnerable to small adversarial perturbations of the input. While extensive work has been done on why such perturbations occur or how to successfully defend against them, we still do not have a complete understanding of robustness. In this work, we investigate the connection between robustness and simplicity. We find that simpler classifiers, formed by reducing the number of output classes, are less susceptible to adversarial perturbations. Consequently, we demonstrate that decomposing a complex multiclass model into an aggregation of binary models enhances robustness. This behavior is consistent across different datasets and model architectures and can be combined with known defense techniques such as adversarial training. Moreover, we provide further evidence of a disconnect between standard and robust learning regimes. In particular, we show that elaborate label information can help standard accuracy but harm robustness.
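
The decomposition the abstract describes can be pictured as a one-vs-rest aggregation: each binary model answers "is this class k or not?", and the per-class scores are combined into a multiclass prediction. Below is a minimal PyTorch sketch of that idea, not the authors' exact construction; the class name `BinaryDecomposition` and the `make_binary_model` factory are illustrative assumptions, and the paper's aggregation rule may differ.

```python
# Sketch of a one-vs-rest decomposition: the multiclass classifier is
# replaced by an aggregation of independent binary models. Names here
# are illustrative, not the paper's implementation.
import torch
import torch.nn as nn

class BinaryDecomposition(nn.Module):
    """Aggregates one binary classifier per output class (one-vs-rest)."""
    def __init__(self, make_binary_model, num_classes):
        super().__init__()
        # One independent binary model per class; `make_binary_model`
        # is a hypothetical factory returning a fresh binary classifier.
        self.models = nn.ModuleList(
            [make_binary_model() for _ in range(num_classes)]
        )

    def forward(self, x):
        # Each binary model emits a single "belongs to class k?" score;
        # concatenating the scores recovers a multiclass prediction.
        scores = [m(x) for m in self.models]   # each tensor: (batch, 1)
        return torch.cat(scores, dim=1)        # (batch, num_classes)

# Toy usage: ten binary logistic models over flattened 28x28 inputs.
make_model = lambda: nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1))
clf = BinaryDecomposition(make_model, num_classes=10)
logits = clf(torch.randn(4, 1, 28, 28))       # shape: (4, 10)
preds = logits.argmax(dim=1)                  # aggregated class labels
```

Each constituent model here is a simpler classifier in the abstract's sense, since it distinguishes only two output classes, which is the property the paper links to reduced susceptibility to adversarial perturbations.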
