Paper title
Efficient Malware Analysis Using Metric Embeddings
Paper authors
Paper abstract
In this paper, we explore the use of metric learning to embed Windows PE files in a low-dimensional vector space for downstream use in a variety of applications, including malware detection, family classification, and malware attribute tagging. Specifically, we enrich labeling on malicious and benign PE files using computationally expensive, disassembly-based malicious capabilities. Using these capabilities, we derive several different types of metric embeddings using an embedding neural network trained via contrastive loss, Spearman rank correlation, and combinations thereof. We then examine performance on a variety of transfer tasks performed on the EMBER and SOREL datasets, demonstrating that for several tasks, low-dimensional, computationally efficient metric embeddings maintain performance with little decay, which offers the potential to quickly retrain for a variety of transfer tasks at significantly reduced storage overhead. We conclude with an examination of practical considerations for the use of our proposed embedding approach, such as robustness to adversarial evasion and the introduction of task-specific auxiliary objectives to improve performance on mission-critical tasks.
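As an illustration of the contrastive objective named in the abstract, the following is a minimal sketch of the classical margin-based pairwise contrastive loss. It is not taken from the paper: the function name `contrastive_loss`, the `margin` default, and the use of a binary `similar` label per pair are all assumptions made for illustration, and the paper's actual formulation (for example, how capability labels define pair similarity) may differ.

```python
import numpy as np


def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Classical pairwise contrastive loss (illustrative, hypothetical helper).

    emb_a, emb_b: arrays of shape (n_pairs, dim) holding paired embeddings.
    similar:      array of shape (n_pairs,), 1 for similar pairs, 0 otherwise.
    margin:       assumed hyperparameter; dissimilar pairs closer than this
                  distance are penalized.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    similar = np.asarray(similar, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Similar pairs are pulled together: penalty grows with squared distance.
    pos_term = similar * dist ** 2
    # Dissimilar pairs are pushed apart until they are at least `margin` away.
    neg_term = (1.0 - similar) * np.maximum(margin - dist, 0.0) ** 2

    return float(np.mean(pos_term + neg_term))


if __name__ == "__main__":
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [[0.0, 0.0], [3.0, 4.0]]
    # First pair identical and labeled similar, second pair far apart and
    # labeled dissimilar: neither pair is penalized.
    print(contrastive_loss(a, b, similar=[1, 0]))
```

In a setting like the one described, `emb_a` and `emb_b` would be outputs of the embedding network for two PE files, and minimizing this quantity places similarly labeled files close together in the low-dimensional space.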