Meta-Learning for Automated Selection of Anomaly Detectors for Semi-Supervised Datasets

Paper Title

Meta-Learning for Automated Selection of Anomaly Detectors for Semi-Supervised Datasets

Authors

Schubert, David; Gupta, Pritha; Wever, Marcel

Abstract

In anomaly detection, a prominent task is to induce a model to identify anomalies learned solely based on normal data. Generally, one is interested in finding an anomaly detector that correctly identifies anomalies, i.e., data points that do not belong to the normal class, without raising too many false alarms. Which anomaly detector is best suited depends on the dataset at hand and thus needs to be tailored. The quality of an anomaly detector may be assessed via confusion-based metrics such as the Matthews correlation coefficient (MCC). However, since during training only normal data is available in a semi-supervised setting, such metrics are not accessible. To facilitate automated machine learning for anomaly detectors, we propose to employ meta-learning to predict MCC scores based on metrics that can be computed with normal data only. First promising results can be obtained considering the hypervolume and the false positive rate as meta-features.
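The two kinds of quantity the abstract contrasts can be illustrated with a minimal sketch. The MCC needs all four confusion-matrix counts, and therefore labeled anomalies, whereas the false positive rate can be estimated from held-out normal data alone. The function names, the score convention (a higher score means more anomalous), and the threshold parameter are assumptions made for illustration, not the authors' implementation. The hypervolume meta-feature is omitted because the abstract does not define it.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Needs true and false positives and negatives, so it requires
    labeled anomalies as well as labeled normal points.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def false_positive_rate(normal_scores, threshold):
    """Fraction of held-out normal points flagged as anomalous.

    Assumption for this sketch: a point is flagged when its anomaly
    score is strictly greater than the threshold. Only normal data is needed.
    """
    if not normal_scores:
        raise ValueError("normal_scores must not be empty")
    flagged = sum(1 for score in normal_scores if score > threshold)
    return flagged / len(normal_scores)
```

In the setting the abstract describes, a meta-learner would be trained on datasets where both kinds of quantity are known, and would then predict the MCC of a detector on a new dataset from meta-features such as the false positive rate.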
