Paper title
Sequence Preserving Network Traffic Generation
Authors
Abstract
We present the Network Traffic Generator (NTG), a framework for perturbing recorded network traffic with the purpose of generating diverse but realistic background traffic for network simulation and what-if analysis in enterprise environments. The framework preserves many characteristics of the original traffic recorded in an enterprise, as well as sequences of network activities. Using the proposed framework, the original traffic flows are profiled using 200 cross-protocol features. The traffic is aggregated into flows of packets between IP pairs and clustered into groups of similar network activities. Sequences of network activities are then extracted. We examined two methods for extracting sequences of activities: a Markov model and a neural language model. Finally, new traffic is generated using the extracted model. We developed a prototype of the framework and conducted extensive experiments based on two real network traffic collections. Hypothesis testing was used to examine the difference between the distributions of original and generated features, showing that 30-100% of the extracted features were preserved. Small differences between the n-gram perplexities of network activity sequences in the original and generated traffic indicate that the sequences of network activities were well preserved.
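To make the sequence-modelling and evaluation steps of the abstract concrete, the following is a minimal sketch and not the NTG implementation. It assumes each flow has already been assigned an activity label by the clustering step, fits a first-order Markov model over those labels, samples a new sequence, and compares bigram perplexities. The labels ("dns", "http", "smtp"), the function names, and the add-alpha smoothing are illustrative assumptions that do not come from the paper.

```python
import math
import random
from collections import Counter, defaultdict


def fit_counts(sequences):
    """Count label-to-label transitions (a first-order Markov model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Sample a label sequence by walking the chain from `start`.

    Stops early if the current label has no observed successor.
    """
    seq = [start]
    while len(seq) < length and counts.get(seq[-1]):
        followers = counts[seq[-1]]
        labels = list(followers)
        weights = [followers[label] for label in labels]
        seq.append(rng.choices(labels, weights=weights)[0])
    return seq


def bigram_perplexity(counts, seq, vocab_size, alpha=1.0):
    """Bigram perplexity of `seq` under the model, with add-alpha smoothing.

    The smoothing is an assumption of this sketch; it keeps unseen
    transitions from producing a zero probability.
    """
    if len(seq) < 2:
        raise ValueError("need at least two labels to form a bigram")
    log_prob = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        followers = counts.get(prev, {})
        total = sum(followers.values())
        p = (followers.get(nxt, 0) + alpha) / (total + alpha * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(seq) - 1))


# Hypothetical activity-label sequences standing in for clustered flows.
recorded = [
    ["dns", "http", "http", "dns", "smtp"],
    ["dns", "http", "dns", "http", "smtp"],
]
counts = fit_counts(recorded)
vocab = {label for seq in recorded for label in seq}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
generated = generate(counts, "dns", 8, rng)

ppl_recorded = bigram_perplexity(counts, recorded[0], len(vocab))
ppl_generated = bigram_perplexity(counts, generated, len(vocab))
print(generated, ppl_recorded, ppl_generated)
```

A small gap between `ppl_recorded` and `ppl_generated` is the kind of signal the abstract refers to when it says activity sequences are well preserved. The paper's second option, a neural language model, would take the place of `fit_counts` and `generate` while the perplexity comparison stays the same.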