Paper Title

SmartValidator: A Framework for Automatic Identification and Classification of Cyber Threat Data

Authors

Chadni Islam, M. Ali Babar, Roland Croft, Helge Janicke

Abstract

Security Operation Centres (SOCs) use a wide variety of Cyber Threat Information (CTI) to validate security incidents and alerts. Security experts manually define different types of rules and scripts based on CTI to perform validation tasks. These rules and scripts need to be updated continuously because of evolving threats, changing SOC requirements, and the dynamic nature of CTI. The manual process of updating rules and scripts delays the response to attacks. To reduce the burden on human experts and accelerate response, we propose a novel Artificial Intelligence (AI) based framework, SmartValidator. SmartValidator leverages Machine Learning (ML) techniques to enable automated validation of alerts. It consists of three layers that perform data collection, model building, and alert validation, and it frames the validation task as a classification problem. Instead of building and saving models for all possible requirements, we propose to construct the validation models automatically based on a SOC's requirements and CTI. We built a Proof of Concept (PoC) system with eight ML algorithms, two feature engineering techniques, and 18 requirements to investigate the effectiveness and efficiency of SmartValidator. The evaluation results showed that when prediction models were built automatically for classifying cyber threat data, 75% of the models achieved an F1-score above 0.8, which indicates adequate performance of the PoC for use in a real-world organization. The results further showed that dynamic construction of prediction models required 99% fewer models to be built than pre-building models for all possible requirements. The framework can be adopted by various industries to accelerate and automate the validation of alerts and incidents based on their CTI and SOC preferences.
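
The idea of building validation models on demand, rather than pre-building one per possible requirement, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the eight ML algorithms or the feature engineering techniques, so a toy frequency-based classifier stands in for them. The names `Requirement`, `train_model`, `OnDemandValidator`, and the CTI field names and values are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical requirement format: (observed attribute names, attribute to predict).
Requirement = Tuple[Tuple[str, ...], str]
Record = Dict[str, str]
Model = Callable[[Record], str]


def train_model(records: List[Record], requirement: Requirement) -> Model:
    """Toy stand-in for an ML algorithm: most frequent label per observed-value tuple."""
    observed, target = requirement
    votes: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)
    overall: Counter = Counter()
    for rec in records:
        key = tuple(rec[a] for a in observed)
        votes[key][rec[target]] += 1
        overall[rec[target]] += 1
    fallback = overall.most_common(1)[0][0]  # used for unseen value tuples

    def predict(alert: Record) -> str:
        key = tuple(alert[a] for a in observed)
        if key in votes:
            return votes[key].most_common(1)[0][0]
        return fallback

    return predict


class OnDemandValidator:
    """Builds a model the first time a requirement is requested, then reuses it."""

    def __init__(self, cti_records: List[Record]) -> None:
        self.cti_records = cti_records
        self.models: Dict[Requirement, Model] = {}
        self.models_built = 0

    def validate(self, alert: Record, requirement: Requirement) -> str:
        if requirement not in self.models:
            self.models[requirement] = train_model(self.cti_records, requirement)
            self.models_built += 1
        return self.models[requirement](alert)


# Made-up CTI records, for illustration only.
cti = [
    {"protocol": "tcp", "port": "22", "threat_type": "brute_force"},
    {"protocol": "tcp", "port": "22", "threat_type": "brute_force"},
    {"protocol": "tcp", "port": "80", "threat_type": "web_attack"},
    {"protocol": "udp", "port": "53", "threat_type": "dns_tunnel"},
]
validator = OnDemandValidator(cti)
requirement: Requirement = (("protocol", "port"), "threat_type")
label = validator.validate({"protocol": "tcp", "port": "22"}, requirement)
```

In this sketch, a second alert with the same requirement reuses the cached model, so the number of models built grows with the requirements actually requested, not with the number of possible attribute combinations.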
