Paper Title

CodeAttack: Code-Based Adversarial Attacks for Pre-trained Programming Language Models

Authors

Akshita Jha, Chandan K. Reddy

Abstract

Pre-trained programming language (PL) models (such as CodeT5, CodeBERT, GraphCodeBERT, etc.) have the potential to automate software engineering tasks involving code understanding and code generation. However, these models operate in the natural channel of code, i.e., they are primarily concerned with the human understanding of the code. They are not robust to changes in the input and thus are potentially susceptible to adversarial attacks in the natural channel. We propose CodeAttack, a simple yet effective black-box attack model that uses code structure to generate effective, efficient, and imperceptible adversarial code samples, and demonstrate the vulnerabilities of state-of-the-art PL models to code-specific adversarial attacks. We evaluate the transferability of CodeAttack on several code-code (translation and repair) and code-NL (summarization) tasks across different programming languages. CodeAttack outperforms state-of-the-art adversarial NLP attack models, achieving the best overall drop in performance while being more efficient, imperceptible, consistent, and fluent. The code can be found at https://github.com/reddy-lab-code-research/CodeAttack.
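The abstract describes attacks that perturb the "natural channel" of code, i.e., surface tokens that matter to humans (and to PL models) but not to program semantics. As a rough illustration only, not the paper's actual CodeAttack algorithm, the sketch below shows one such semantics-preserving perturbation: renaming an identifier so the program behaves identically while the token sequence a model sees changes. The function name `rename_identifier` and the example snippet are hypothetical.

```python
import re

def rename_identifier(code: str, old: str, new: str) -> str:
    """Rename one identifier, matching whole words only so that
    substrings of other names are left untouched. (A naive sketch:
    a real attack would respect scopes, strings, and comments.)"""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)

snippet = "def add(total, step):\n    return total + step"

# A semantics-preserving adversarial candidate: same program,
# different surface tokens for the model to consume.
adversarial = rename_identifier(snippet, "total", "t0")
print(adversarial)  # → def add(t0, step):\n    return t0 + step
```

An actual black-box attack would search over many such candidate edits and keep those that most degrade the victim model's output while staying imperceptible to a human reader.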
