Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

Paper Title

Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

Authors

Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Neel Sundaresan

Abstract

Unit testing represents the foundational basis of the software testing pyramid, beneath integration and end-to-end testing. Automated software testing researchers have proposed a variety of techniques to assist developers in this time-consuming task. In this paper, we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements. Our approach is based on a state-of-the-art transformer model initially pretrained on an English textual corpus. This semantically rich model is then trained in a semi-supervised fashion on a large corpus of source code. Finally, we finetune this model on the task of generating assert statements for unit tests. The resulting model is able to generate accurate assert statements for a given method under test. In our empirical evaluation, the model predicted the exact assert statements written by developers in 62% of the cases on the first attempt. The results show an 80% relative improvement in top-1 accuracy over the previous RNN-based approach in the literature. We also show the substantial impact of the pretraining process on the performance of our model, and compare it with the assert auto-completion task. Finally, we demonstrate how our approach can be used to augment EvoSuite test cases with additional asserts, leading to improved test coverage.
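The two evaluation quantities named in the abstract, top-1 exact-match accuracy and relative improvement over a baseline, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the whitespace normalization before comparison, and the toy assert statements and numbers below are all assumptions made for illustration only.

```python
def normalize(statement: str) -> str:
    """Collapse runs of whitespace so formatting differences do not count as
    mismatches (an assumption; the paper's exact matching rule may differ)."""
    return " ".join(statement.split())


def top1_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of cases where the first (top-1) prediction equals the
    developer-written assert statement."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    matches = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return matches / len(references)


def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Relative gain of new_score over baseline_score, e.g. 0.8 means 80%."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score - baseline_score) / baseline_score


# Toy data (hypothetical, not taken from the paper).
toy_predictions = ["assertEquals(3,  add(1, 2))", "assertTrue(list.isEmpty())"]
toy_references = ["assertEquals(3, add(1, 2))", "assertFalse(list.isEmpty())"]

toy_accuracy = top1_accuracy(toy_predictions, toy_references)
toy_gain = relative_improvement(new_score=0.5, baseline_score=0.25)
```

With these toy inputs, only the first prediction matches its reference after whitespace normalization, so the top-1 accuracy is one match out of two cases. A relative improvement is a ratio of scores, not a difference in percentage points, which is how the abstract's 80% figure should be read.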
