Paper Title

Adversarial Concept Drift Detection under Poisoning Attacks for Robust Data Stream Mining

Paper Authors

Łukasz Korycki, Bartosz Krawczyk

Paper Abstract

Continuous learning from streaming data is among the most challenging topics in contemporary machine learning. In this domain, learning algorithms must not only handle massive volumes of rapidly arriving data, but also adapt to potentially emerging changes. This evolving nature of data streams is known as concept drift. While there is a plethora of methods designed to detect its occurrence, all of them assume that the drift is connected with underlying changes in the data source. However, one must consider the possibility of a malicious injection of false data that simulates a concept drift. This adversarial setting assumes a poisoning attack that may be conducted in order to damage the underlying classification system by forcing adaptation to false data. Existing drift detectors are not capable of differentiating between real and adversarial concept drift. In this paper, we propose a framework for robust concept drift detection in the presence of adversarial and poisoning attacks. We introduce a taxonomy of two types of adversarial concept drift, as well as a robust trainable drift detector. It is based on an augmented Restricted Boltzmann Machine with an improved gradient computation and energy function. We also introduce Relative Loss of Robustness, a novel measure for evaluating the performance of concept drift detectors under poisoning attacks. Extensive computational experiments, conducted on both fully and sparsely labeled data streams, demonstrate the high robustness and efficacy of the proposed drift detection framework in adversarial scenarios.
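For context on the detector mentioned above: the abstract states that the drift detector augments a Restricted Boltzmann Machine's gradient computation and energy function, but does not give the modified form. As a reference point only, the standard (unmodified) RBM energy function is

E(v, h) = - \sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j

where v and h denote the visible and hidden units, a and b their biases, and w_{ij} the connection weights (standard RBM notation, not taken from this paper). The paper's augmented energy function and improved gradient are defined in the full text.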
