Paper Title
A C Code Generator for Fast Inference and Simple Deployment of Convolutional Neural Networks on Resource Constrained Systems
Paper Authors
Paper Abstract
Inference of Convolutional Neural Networks (CNNs) in time-critical applications usually requires a GPU. In robotics or embedded devices, GPUs are often not available due to energy, space and cost constraints. Furthermore, installing a deep learning framework, or even a native compiler, on the target platform is not possible. This paper presents a neural network code generator (NNCG) that generates, from a trained CNN, a plain ANSI C code file that encapsulates the inference in a single function. The file can easily be included in existing projects and, because it has no dependencies, cross compilation is usually possible. Additionally, the code generation is optimized for the known trained CNN and the target platform, following four design principles. The system is evaluated using small CNNs designed for this application. Compared to TensorFlow XLA and Glow, speed-ups of up to a factor of 11.81 are shown, and even GPUs are outperformed in terms of latency.
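To illustrate what "a plain ANSI C code file that encapsulates the inference in a single function" could look like, the following is a minimal hand-written sketch, not actual NNCG output. The function name `cnn_inference`, the layer sizes, and the weight values are all hypothetical and chosen only for illustration. The sketch assumes one 3x3 convolution without padding followed by a ReLU; because the trained network is known at generation time, the shapes and weights appear as compile-time constants and the file needs no headers or libraries.

```c
/* Hypothetical sketch of a generated inference file.
   Names, shapes and weights are illustrative assumptions,
   not output of the real NNCG tool. */

/* Layer shapes are fixed because the trained network is known. */
#define IN_H 4
#define IN_W 4
#define K 3
#define OUT_H (IN_H - K + 1)
#define OUT_W (IN_W - K + 1)

/* Trained parameters embedded as constants (made-up values). */
static const float conv_w[K * K] = {
    0.0f,  1.0f, 0.0f,
    1.0f, -4.0f, 1.0f,
    0.0f,  1.0f, 0.0f
};
static const float conv_b = 0.5f;

/* Single entry point: 3x3 convolution without padding, then ReLU.
   in  points to IN_H * IN_W floats in row-major order,
   out points to OUT_H * OUT_W floats in row-major order. */
void cnn_inference(const float *in, float *out)
{
    int y, x, ky, kx;

    for (y = 0; y < OUT_H; ++y) {
        for (x = 0; x < OUT_W; ++x) {
            float acc = conv_b;
            for (ky = 0; ky < K; ++ky) {
                for (kx = 0; kx < K; ++kx) {
                    acc += conv_w[ky * K + kx]
                         * in[(y + ky) * IN_W + (x + kx)];
                }
            }
            /* ReLU activation */
            out[y * OUT_W + x] = acc > 0.0f ? acc : 0.0f;
        }
    }
}
```

A file of this shape has no external dependencies, so it can be compiled together with an existing project by any C compiler or cross compiler, and the caller only passes an input buffer and an output buffer. Since all loop bounds are constants, a generator has room to unroll loops or specialize the code for the target platform, although the specific optimizations NNCG applies are not shown here.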