Paper Title

Cortex: A Compiler for Recursive Deep Learning Models

Paper Authors

Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry

Paper Abstract

Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations, such as kernel fusion, and (ii) low-level kernel optimizations, such as those found in vendor libraries. This approach often leaves significant performance on the table, especially for recursive deep learning models. In this paper, we present Cortex, a compiler-based approach to generate highly efficient code for recursive models for low-latency inference. Our compiler approach and low reliance on vendor libraries enable us to perform end-to-end optimizations, leading to up to 14X lower inference latencies over past work, across different backends.
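To illustrate what "recursive deep learning model" means here, below is a minimal, hypothetical sketch (not from the paper): such models evaluate a tree bottom-up, combining child representations at each internal node, so the control flow depends on the input structure rather than a fixed computation graph. The `Node` class and the toy elementwise-average combine step are illustrative assumptions, standing in for a learned combiner such as a TreeLSTM cell.

```python
# Hypothetical sketch of a recursive (tree-structured) model evaluation.
# The combine step here is a toy elementwise average; a real model would
# apply learned weights (e.g., a TreeLSTM cell) at each internal node.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    value: List[float]                               # embedding, used at leaves
    children: List["Node"] = field(default_factory=list)

def evaluate(node: Node) -> List[float]:
    """Recursively compute a node's representation from its children."""
    if not node.children:
        return node.value
    child_reprs = [evaluate(c) for c in node.children]
    dim = len(child_reprs[0])
    # Toy combine: elementwise average of the child representations.
    return [sum(r[i] for r in child_reprs) / len(child_reprs) for i in range(dim)]

# A small input tree: two leaves under one root.
tree = Node(value=[], children=[Node([1.0, 3.0]), Node([3.0, 5.0])])
print(evaluate(tree))  # -> [2.0, 4.0]
```

Because the recursion pattern varies per input, fixed kernel pipelines from vendor libraries fit poorly, which is the gap a compiler-based, end-to-end approach like Cortex targets.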
