Paper Title
Adversarial Example Detection in Deployed Tree Ensembles
Paper Authors
Paper Abstract
Tree ensembles are powerful models that are widely used. However, they are susceptible to adversarial examples: inputs purposely constructed to elicit a misprediction from the model. This can degrade performance and erode a user's trust in the model. Typically, approaches alleviate this problem by verifying how robust a learned ensemble is or by robustifying the learning process. We take an alternative approach and attempt to detect adversarial examples in a post-deployment setting. We present a novel method for this task that works by analyzing an unseen example's output configuration, which is the set of predictions made by the ensemble's constituent trees. Our approach works with any additive tree ensemble and does not require training a separate model. We evaluate our approach on three different tree ensemble learners. We empirically show that our method is currently the best adversarial detection method for tree ensembles.
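The central object in the abstract is the "output configuration": the vector of predictions produced by each constituent tree of the ensemble for a single example. A minimal sketch of extracting such a configuration, using a scikit-learn random forest as a stand-in tree ensemble (this is an illustration, not the paper's implementation; the function name `output_configuration` is hypothetical):

```python
# Sketch: an example's "output configuration" is the vector of
# per-tree predictions from an additive tree ensemble. We use a
# scikit-learn random forest as a stand-in ensemble here; the
# paper's method applies to any additive tree ensemble.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

def output_configuration(ensemble, x):
    """Return the per-tree predictions (one entry per constituent tree)
    for a single example x. (Hypothetical helper for illustration.)"""
    return np.array([tree.predict(x.reshape(1, -1))[0]
                     for tree in ensemble.estimators_])

config = output_configuration(forest, X[0])
print(config.shape)  # one prediction per tree in the ensemble
```

A detection method could then compare an unseen example's configuration against the configurations observed on trusted data; an adversarial input tends to induce an atypical pattern of per-tree predictions even when the aggregated prediction looks ordinary.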