Paper Title

Bad Citrus: Reducing Adversarial Costs with Model Distances

Authors

Giorgio Severi, Will Pearce, Alina Oprea

Abstract

Recent work by Jia et al. showed that pairwise model distances in weight space can be computed effectively using a model explanation technique known as LIME. This method requires only query access to the two models under examination. We argue that an adversary can leverage this insight to reduce the net cost (number of queries) of launching an evasion campaign against a deployed model. We show that there is a strong negative correlation between the success rate of adversarial transfer and the distance between the victim model and the surrogate used to generate the evasive samples. Thus, we propose and evaluate a method to reduce adversarial costs by finding the closest surrogate model for adversarial transfer.
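
The surrogate-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and not the exact procedure of Jia et al.: it assumes a simplified LIME-like step (a least-squares linear fit on Gaussian perturbations around a reference point) and uses cosine distance between the resulting local weights as a stand-in for model distance. All function names (`local_linear_weights`, `model_distance`, `closest_surrogate`) and the toy logistic models are hypothetical and chosen for illustration only.

```python
import numpy as np


def local_linear_weights(predict, x, n_samples=500, scale=0.1, seed=0):
    """LIME-like step (assumption): fit a linear model to query-only
    outputs of `predict` on Gaussian perturbations around `x`."""
    rng = np.random.default_rng(seed)
    X = x + scale * rng.standard_normal((n_samples, x.shape[0]))
    y = predict(X)  # only query access to the model is used
    A = np.hstack([X - x, np.ones((n_samples, 1))])  # features + intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]  # drop the intercept, keep feature weights


def model_distance(predict_a, predict_b, reference_points):
    """Mean cosine distance between the two models' local weights
    (a stand-in for the distance measure, labeled as an assumption)."""
    dists = []
    for i, x in enumerate(reference_points):
        # Same seed for both models so they see identical perturbations.
        wa = local_linear_weights(predict_a, x, seed=i)
        wb = local_linear_weights(predict_b, x, seed=i)
        cos = wa @ wb / (np.linalg.norm(wa) * np.linalg.norm(wb) + 1e-12)
        dists.append(1.0 - cos)
    return float(np.mean(dists))


def closest_surrogate(victim_predict, surrogates, reference_points):
    """Return the name of the surrogate nearest to the victim, plus all distances."""
    distances = {
        name: model_distance(victim_predict, f, reference_points)
        for name, f in surrogates.items()
    }
    return min(distances, key=distances.get), distances


def make_logistic(w):
    """Toy black-box model: logistic regression with fixed weights `w`."""
    w = np.asarray(w, dtype=float)
    return lambda X: 1.0 / (1.0 + np.exp(-(X @ w)))


# Toy setup: one victim and two candidate surrogates (all hypothetical).
victim = make_logistic([1.0, -2.0, 0.5])
surrogates = {
    "near": make_logistic([1.1, -1.9, 0.4]),
    "far": make_logistic([-1.0, 0.5, 2.0]),
}
reference_points = np.random.default_rng(42).standard_normal((5, 3))

best, distances = closest_surrogate(victim, surrogates, reference_points)
print(best, distances)
```

In this sketch an adversary would then craft evasive samples against the selected surrogate and transfer them to the victim. The query cost of the selection step is `n_samples` per reference point per model, which would have to be weighed against the queries saved during the evasion campaign.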
