Paper Title

Language Models Can Teach Themselves to Program Better

Paper Authors

Patrick Haluptzok, Matthew Bowers, Adam Tauman Kalai

Paper Abstract

Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems, even solving some competitive-programming problems. Self-play has proven useful in games such as Go, and thus it is natural to ask whether LMs can generate their own instructive programming problems to improve their performance. We show that it is possible for an LM to synthesize programming problems and solutions, which are filtered for correctness by a Python interpreter. The LM's performance is then seen to improve when it is fine-tuned on its own synthetic problems and verified solutions; thus the model 'improves itself' using the Python interpreter. Problems are specified formally as programming puzzles [Schuster et al., 2021], a code-based problem format where solutions can easily be verified for correctness by execution. In experiments on publicly-available LMs, test accuracy more than doubles. This work demonstrates the potential for code LMs, with an interpreter, to generate instructive problems and improve their own performance.
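
To make the puzzle format concrete, below is a minimal sketch of how a programming puzzle and a verified solution fit together, following the convention in Schuster et al. [2021] of a puzzle function `f` and a solution `g`. The specific puzzle, the `verify` helper, and its use as a filtering step are illustrative assumptions, not code taken from the paper.

```python
# Illustrative sketch (assumed, not from the paper): a programming puzzle
# in the style of Schuster et al. [2021]. A puzzle is a function f that
# returns True on a correct solution; a solution g supplies that input.

def f(s: str) -> bool:
    """Puzzle: find a string of length 10 containing exactly three 'a's."""
    return len(s) == 10 and s.count("a") == 3


def g() -> str:
    """A candidate solution, e.g. as a language model might generate it."""
    return "aaabbbbbbb"


def verify(puzzle, solution) -> bool:
    """Filter step: keep a (puzzle, solution) pair only if executing the
    puzzle on the proposed solution returns True."""
    try:
        return puzzle(solution()) is True
    except Exception:
        # Any runtime error means the pair is discarded rather than kept.
        return False


if __name__ == "__main__":
    # A verified pair like this one would be retained as fine-tuning data.
    print(verify(f, g))  # True
```

Because correctness is checked purely by execution, the interpreter can filter model-generated (puzzle, solution) pairs without human labels, which is what enables the self-improvement loop described in the abstract.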
