Paper Title
Reinforcement Based Learning on Classification Task Could Yield Better Generalization and Adversarial Accuracy
Paper Authors
Paper Abstract
Deep learning has become immensely popular in computer vision, attaining near- or above-human-level performance in various vision tasks. But recent work has also demonstrated that these deep neural networks are very vulnerable to adversarial examples (adversarial examples: inputs to a model that are nearly indistinguishable from the original data but fool the model into classifying them into a wrong class). Humans are very robust against such perturbations; one possible reason is that humans may not learn to classify by minimizing an error between a "target label" and a "predicted label", but rather from the reinforcement they receive on their predictions. In this work, we propose a novel method to train deep learning models on an image classification task. We use a reward-based optimization function, similar to the vanilla policy gradient method used in reinforcement learning, to train our model instead of the conventional cross-entropy loss. An empirical evaluation on the CIFAR-10 dataset shows that our method learns a more robust classifier than the same model architecture trained with the cross-entropy loss function (under adversarial training). At the same time, our method generalizes better: the gap between train and test accuracy stays $< 2\%$ most of the time, whereas for the cross-entropy model it mostly remains $> 2\%$.
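The reward-based objective described in the abstract resembles a one-step REINFORCE (vanilla policy gradient) update, where the classifier's softmax output is treated as a policy over class labels. Below is a minimal sketch of such a loss in PyTorch, assuming a ±1 reward for a correctly/incorrectly sampled label and no baseline; the paper's exact reward scheme may differ, and the function name `policy_gradient_loss` is hypothetical.

```python
import torch

def policy_gradient_loss(logits, targets):
    """One-step REINFORCE loss for classification (sketch).

    The softmax over class logits is treated as a policy; a class
    label is sampled per example, rewarded +1 if it matches the
    target and -1 otherwise, and the negative log-probability of
    the sampled label is scaled by that reward.
    """
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                         # sampled class labels, shape (batch,)
    rewards = (actions == targets).float() * 2 - 1  # +1 correct, -1 incorrect
    # REINFORCE: minimize -E[reward * log pi(action | input)]
    return -(rewards * dist.log_prob(actions)).mean()
```

In a training loop this would replace the usual `F.cross_entropy(logits, labels)` call: compute `logits = model(images)`, then `policy_gradient_loss(logits, labels).backward()`. Because the gradient flows only through the log-probability of the sampled label, updates are noisier than with cross-entropy, which is the usual trade-off of policy-gradient estimators.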