Paper Title

Towards Full-line Code Completion with Neural Language Models

Paper Authors

Wenhan Wang, Sijie Shen, Ge Li, Zhi Jin

Paper Abstract

A code completion system suggests future code elements to developers given a partially-complete code snippet. Code completion is one of the most useful features in Integrated Development Environments (IDEs). Currently, most code completion techniques predict a single token at a time. In this paper, we take a further step and discuss the possibility of directly completing a whole line of code instead of a single token. We believe suggesting longer code sequences can further improve the efficiency of developers. Recently, neural language models have been adopted as a preferred approach for code completion, and we believe these models can still be applied to full-line code completion with a few improvements. We conduct our experiments on two real-world Python corpora and evaluate existing neural models based on source code tokens or syntactical actions. The results show that neural language models can achieve acceptable results on our task, with significant room for improvement.
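
The page attaches no code; as a rough, illustrative sketch of the full-line setting described above (not the paper's actual token- or action-based models), the snippet below greedily decodes tokens from a stand-in language model until an end-of-line marker appears, so the suggestion covers a whole line rather than a single token. Both next_token_logprobs and the <EOL> token are hypothetical placeholders for a trained model's interface.

from typing import Dict, List


def next_token_logprobs(prefix: List[str]) -> Dict[str, float]:
    # Hypothetical interface to a trained neural language model:
    # returns a log-probability for each candidate next source-code token.
    raise NotImplementedError("plug in a trained token- or action-level model")


def complete_line(prefix: List[str], max_tokens: int = 30) -> List[str]:
    # Greedily extend the prefix one token at a time; stopping at the
    # end-of-line token turns single-token prediction into a full-line suggestion.
    completion: List[str] = []
    for _ in range(max_tokens):
        scores = next_token_logprobs(prefix + completion)
        token = max(scores, key=scores.get)
        if token == "<EOL>":
            break
        completion.append(token)
    return completion


# Example call (requires a real model behind next_token_logprobs):
# complete_line(["for", "i", "in"])  ->  e.g. ["range", "(", "n", ")", ":"]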
