Toward Understanding Deep Learning Framework Bugs

Paper Title

Toward Understanding Deep Learning Framework Bugs

Authors

Chen, Junjie; Liang, Yihua; Shen, Qingchao; Jiang, Jiajun; Li, Shuochuan

Abstract

DL frameworks are the basis for constructing all DL programs and models, so their bugs can lead to unexpected behavior in any DL program or model that relies on them. Such a wide effect demonstrates the necessity and importance of guaranteeing the quality of DL frameworks. Understanding the characteristics of DL framework bugs is a fundamental step for this quality assurance task, as it facilitates the design of effective bug detection and debugging approaches. Hence, in this work we conduct the largest-scale study on 1,000 bugs from four popular and diverse DL frameworks (i.e., TensorFlow, PyTorch, MXNet, and DL4J). By analyzing the root causes and symptoms of DL framework bugs associated with 5 components decomposed from DL frameworks, as well as measuring the test coverage achieved by three state-of-the-art testing techniques, we obtain 12 major findings for a comprehensive understanding of DL framework bugs and the current status of existing DL framework testing practice, and then provide a series of actionable guidelines for better DL framework bug detection and debugging. Finally, based on the guidelines, we design and implement a prototype DL framework testing tool, called TenFuzz, which is evaluated to be effective and finds 3 unknown bugs in the latest TensorFlow framework in a preliminary study, indicating the significance of our guidelines.
