Paper Title

A Library for Representing Python Programs as Graphs for Machine Learning

Authors

David Bieber, Kensen Shi, Petros Maniatis, Charles Sutton, Vincent Hellendoorn, Daniel Johnson, Daniel Tarlow

Abstract

Graph representations of programs are commonly a central element of machine learning for code research. We introduce python_graphs, an open-source Python library that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. The library supports the construction of control-flow graphs, data-flow graphs, and composite "program graphs" that combine control-flow, data-flow, syntactic, and lexical information about a program. We present the capabilities and limitations of the library, perform a case study applying it to millions of competitive programming submissions, and showcase its utility for machine learning research.
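To make the idea of a program graph concrete, the following is a minimal sketch that uses only Python's standard `ast` module. It is not the python_graphs API: the function name `ast_to_graph` and the `(nodes, edges)` layout are assumptions made for illustration. It covers only the syntactic part of such a graph (parent-to-child AST edges), leaving out the control-flow and data-flow edges that the abstract describes.

```python
import ast


def ast_to_graph(source):
    """Build a simple syntax graph from Python source code.

    Hypothetical helper for illustration only (not part of python_graphs).
    Returns:
      nodes: list of AST node type names, indexed by node id.
      edges: list of (parent_id, child_id, field_name) tuples.
    """
    nodes = []
    edges = []

    def visit(node):
        # Assign the next free integer id to this AST node occurrence.
        node_id = len(nodes)
        nodes.append(type(node).__name__)
        for field, value in ast.iter_fields(node):
            # A field holds either a single child or a list of children.
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, ast.AST):
                    child_id = visit(child)
                    edges.append((node_id, child_id, field))
        return node_id

    visit(ast.parse(source))
    return nodes, edges


SOURCE = """
def f(x):
    y = x + 1
    return y
"""

nodes, edges = ast_to_graph(SOURCE)
for parent, child, field in edges:
    print(f"{nodes[parent]} --{field}--> {nodes[child]}")
```

Each edge is labeled with the AST field that links parent to child, which is one common way to expose syntactic structure to a graph-based model. A fuller program graph would add further edge types on top of this skeleton.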
