Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline

Paper Title

Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline

Authors

Raza, Syed Muhammad Kumail; Caballero, Juan

Abstract

Identifying threats in an encrypted network traffic flow is uniquely challenging. On one hand, modern encryption algorithms make it extremely difficult to simply decrypt the traffic. On the other hand, passing such an encrypted stream through pattern-matching algorithms is useless, because encryption ensures that the payload contains no recognizable patterns. Moreover, evaluating detection models is also difficult due to the lack of labeled benign and malware datasets. Other approaches have tried to tackle this problem by employing observable meta-data gathered from the flow. We augment this approach by extending it to a semi-supervised malware classification pipeline that uses this observable meta-data. To this end, we explore and test different kinds of clustering approaches that make use of a unique and diverse set of features extracted from the meta-data. We also propose an automated packet data-labeling pipeline that generates ground-truth data, which can serve as a baseline for evaluating the classifiers mentioned above in particular, or any other detection model in general.
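
The abstract does not name the specific clustering algorithms or features used, so the following is only a minimal sketch of the general idea of semi-supervised classification over flow meta-data. It assumes two hypothetical features per flow (mean packet size and flow duration), plain k-means as the clustering step, and majority voting over a handful of labeled flows to name each cluster. None of these choices are taken from the paper.

```python
from collections import Counter
from math import dist

# Hypothetical per-flow meta-data features (not from the paper):
# (mean packet size in bytes, flow duration in seconds).
# Real features would normally be scaled before clustering.
flows = [
    (1200.0, 30.0), (1150.0, 28.0), (1250.0, 33.0),  # long flows, large packets
    (200.0, 2.0), (220.0, 1.5), (180.0, 2.5),        # short flows, small packets
]

# Only a few flows carry ground-truth labels (flow index -> label).
known_labels = {0: "benign", 3: "malware"}


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centroids = list(centroids)  # avoid mutating the caller's list
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment


def propagate_labels(assignment, known):
    """Name each cluster by majority vote of the labeled flows it contains."""
    votes = {}
    for idx, label in known.items():
        votes.setdefault(assignment[idx], Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Clusters without any labeled member stay "unknown".
    return [cluster_label.get(a, "unknown") for a in assignment]


# Seed the two centroids with one flow from each visible group.
assignment = kmeans(flows, centroids=[flows[0], flows[3]])
predicted = propagate_labels(assignment, known_labels)
print(predicted)
```

In this toy setup, the unlabeled flows inherit the label of the cluster they fall into. A ground-truth dataset such as the one the abstract proposes would be what such predictions are scored against.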
