Paper Title
A Hybrid Active-Passive Approach to Imbalanced Nonstationary Data Stream Classification
Paper Authors
Paper Abstract
In real-world applications, the process generating the data might suffer from nonstationary effects (e.g., due to seasonality, faults affecting sensors or actuators, and changes in the users' behaviour). These changes, often called concept drift, might have severe (potentially catastrophic) impacts on trained learning models, which become obsolete over time and inadequate to solve the task at hand. Learning in the presence of concept drift aims at designing machine and deep learning models that are able to track and adapt to concept drift. Typically, techniques to handle concept drift are either active or passive, and traditionally these have been considered mutually exclusive. Active techniques use an explicit drift detection mechanism and re-train the learning algorithm when concept drift is detected. Passive techniques use an implicit method to deal with drift and continually update the model using incremental learning. Unlike existing work in the literature, we propose a hybrid alternative that merges the two approaches, hence leveraging the advantages of both. The proposed method, called Hybrid-Adaptive REBAlancing (HAREBA), significantly outperforms strong baselines and state-of-the-art methods in terms of learning quality and speed; our experiments show that it is also effective under severe class imbalance levels.
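
To make the active/passive distinction in the abstract concrete, the following is a minimal illustrative sketch in Python. It is not HAREBA and does not reproduce any part of the paper's method. The toy nearest-mean classifier, the windowed error-rate detector, the synthetic stream, and all names and parameter values (`NearestMeanClassifier`, `run_passive`, `run_active`, `forgetting`, `window`, `threshold`) are assumptions chosen only for illustration.

```python
from collections import deque


class NearestMeanClassifier:
    """Toy 1-D classifier (hypothetical): predicts the class whose mean is closest to x."""

    def __init__(self, forgetting=None):
        # forgetting=None -> plain running average over all samples seen;
        # otherwise an exponential moving average with that step size.
        self.forgetting = forgetting
        self.means = {}
        self.counts = {}

    def predict(self, x):
        if not self.means:
            return 0  # arbitrary default before any training data
        return min(self.means, key=lambda c: abs(x - self.means[c]))

    def update(self, x, y):
        if y not in self.means:
            self.means[y], self.counts[y] = x, 1
            return
        self.counts[y] += 1
        step = self.forgetting if self.forgetting is not None else 1.0 / self.counts[y]
        self.means[y] += step * (x - self.means[y])


def make_stream(n=200, drift_at=100):
    """Synthetic stream: the two class positions swap at `drift_at` (abrupt drift)."""
    stream = []
    for t in range(n):
        y = t % 2
        x = float(y) if t < drift_at else float(1 - y)
        stream.append((x, y))
    return stream


def run_passive(stream, forgetting=0.1):
    """Passive: no detector; update on every sample, with forgetting."""
    model, errors = NearestMeanClassifier(forgetting), 0
    for x, y in stream:
        errors += model.predict(x) != y  # test-then-train evaluation
        model.update(x, y)
    return errors


def run_active(stream, window=20, threshold=0.5):
    """Active: monitor the recent error rate; on detection, re-train on the window."""
    model, errors, detections = NearestMeanClassifier(), 0, []
    recent = deque(maxlen=window)  # (x, y, was_error) for the last `window` samples
    for t, (x, y) in enumerate(stream):
        wrong = model.predict(x) != y
        errors += wrong
        recent.append((x, y, wrong))
        model.update(x, y)
        error_rate = sum(w for _, _, w in recent) / window
        if len(recent) == window and error_rate > threshold:
            detections.append(t)
            model = NearestMeanClassifier()  # discard the outdated model
            for xr, yr, _ in recent:         # re-train on recent data only
                model.update(xr, yr)
            recent.clear()
    return errors, detections


stream = make_stream()
print("passive errors:", run_passive(stream))
print("active errors, detection times:", run_active(stream))
```

In this toy setting both strategies recover from the abrupt drift. The passive learner adapts gradually through forgetting, while the active learner keeps its model until the detector fires and then rebuilds it from the recent window. A hybrid scheme of the kind the abstract describes would combine an explicit detector with continual incremental updates; how HAREBA does this, and how it rebalances classes, is not specified in the abstract and is not modelled here.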