Paper Title
Learning Black-Box Attackers with Transferable Priors and Query Feedback
Paper Authors
Paper Abstract
This paper addresses the challenging black-box adversarial attack problem, where only the classification confidence of a victim model is available. Inspired by the consistency of visual saliency across different vision models, a surrogate model is expected to improve the attack performance via transferability. By combining transferability-based and query-based black-box attacks, we propose a surprisingly simple baseline approach (named SimBA++) using the surrogate model, which significantly outperforms several state-of-the-art methods. Moreover, to efficiently utilize the query feedback, we update the surrogate model in a novel learning scheme, named High-Order Gradient Approximation (HOGA). By constructing a high-order gradient computation graph, we update the surrogate model to approximate the victim model in both the forward and backward passes. SimBA++ and HOGA together yield the Learnable Black-Box Attack (LeBA), which surpasses the previous state of the art by considerable margins: the proposed LeBA significantly reduces queries while maintaining an attack success rate close to 100% in extensive ImageNet experiments, including attacks on vision benchmarks and defensive models. Code is open source at https://github.com/TrustworthyDL/LeBA.
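To make the query-based half of SimBA++ concrete, below is a minimal PyTorch sketch of one SimBA-style query step, under the assumptions described in the abstract rather than the paper's released code: `victim_prob` (a closure returning the victim's probability vector for one image) and the candidate direction `q` are illustrative names. In SimBA++, `q` would alternate between random directions and directions derived from the surrogate model's gradient, which is how the transferable prior enters.

```python
import torch

@torch.no_grad()
def simba_step(victim_prob, x, p, q, eps, label):
    # One SimBA-style query step (a sketch, not the authors' exact code).
    # `p` is the victim's current confidence on the true label, carried
    # over from the previous step to avoid a redundant query.
    for sign in (1.0, -1.0):
        x_new = (x + sign * eps * q).clamp(0.0, 1.0)
        p_new = victim_prob(x_new)[label]   # one victim query
        if p_new < p:                       # keep the step if confidence drops
            return x_new, p_new
    return x, p                             # neither direction helped
```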
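HOGA updates the surrogate from query feedback by differentiating through the surrogate's own input gradient. The following is a hedged sketch of one such update, assuming the feedback signal is the change in the victim's loss between two queries; the regression loss and names such as `observed` are illustrative assumptions, not the paper's exact formulation, and the forward-pass matching mentioned in the abstract is omitted here. The key mechanism is `create_graph=True`, which keeps the surrogate's input gradient in the autograd graph so that the mismatch can be backpropagated into the surrogate parameters, i.e., a high-order gradient.

```python
import torch
import torch.nn.functional as F

def hoga_update(surrogate, optimizer, x, delta,
                victim_loss_before, victim_loss_after, label):
    # Sketch of a HOGA-style surrogate update (illustrative, not the paper's exact loss).
    x = x.detach().clone().requires_grad_(True)
    loss_s = F.cross_entropy(surrogate(x), label)
    # Input gradient kept in the graph: this is the high-order gradient node.
    (g,) = torch.autograd.grad(loss_s, x, create_graph=True)
    # First-order prediction of the loss change along the queried perturbation.
    predicted = (g * delta).flatten(1).sum(dim=1)
    observed = victim_loss_after - victim_loss_before  # from two victim queries
    loss_hoga = F.mse_loss(predicted, observed)
    optimizer.zero_grad()
    loss_hoga.backward()   # backprop through the gradient computation itself
    optimizer.step()
    return loss_hoga.item()
```

Under this reading, a surrogate trained this way predicts not only the victim's outputs but also how those outputs move along queried directions, which is what makes its gradients a useful prior for subsequent query steps.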