Paper Title

Cortex: A Compiler for Recursive Deep Learning Models

Paper Authors

Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry

Paper Abstract

Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations, such as kernel fusion, and (ii) low-level kernel optimizations, such as those found in vendor libraries. This approach often leaves significant performance on the table, especially for recursive deep learning models. In this paper, we present Cortex, a compiler-based approach to generate highly efficient code for recursive models for low-latency inference. Our compiler approach and low reliance on vendor libraries enable us to perform end-to-end optimizations, leading to up to 14X lower inference latencies over past work, across different backends.
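To illustrate what "recursive deep learning model" means here, below is a minimal, hypothetical sketch (not from the paper): such models evaluate a tree bottom-up, combining child representations at each internal node, so the control flow depends on the input structure rather than a fixed computation graph. The `Node` class and the toy elementwise-average combine step are illustrative assumptions, standing in for a learned combiner such as a TreeLSTM cell.

```python
# Hypothetical sketch of a recursive (tree-structured) model evaluation.
# The combine step here is a toy elementwise average; a real model would
# apply learned weights (e.g., a TreeLSTM cell) at each internal node.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    value: List[float]                               # embedding, used at leaves
    children: List["Node"] = field(default_factory=list)

def evaluate(node: Node) -> List[float]:
    """Recursively compute a node's representation from its children."""
    if not node.children:
        return node.value
    child_reprs = [evaluate(c) for c in node.children]
    dim = len(child_reprs[0])
    # Toy combine: elementwise average of the child representations.
    return [sum(r[i] for r in child_reprs) / len(child_reprs) for i in range(dim)]

# A small input tree: two leaves under one root.
tree = Node(value=[], children=[Node([1.0, 3.0]), Node([3.0, 5.0])])
print(evaluate(tree))  # -> [2.0, 4.0]
```

Because the recursion pattern varies per input, fixed kernel pipelines from vendor libraries fit poorly, which is the gap a compiler-based, end-to-end approach like Cortex targets.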
