Paper title
Optimizing Tensor Programs on Flexible Storage
Authors
Abstract
Tensor programs often need to process large tensors (vectors, matrices, or higher-order tensors) that require a specialized storage format for their memory layout. Several such layouts have been proposed in the literature, such as the Coordinate format, the Compressed Sparse Row format, and many others, each specifically designed to optimally store tensors with particular sparsity properties. However, existing tensor processing systems require specialized extensions to take advantage of each new storage format. In this paper, we describe a system that allows users to define flexible storage formats in a declarative tensor query language, similar to the language used by the tensor program. The programmer only needs to write storage mappings, which describe, in a declarative way, how the tensors are laid out in main memory. We then describe a cost-based optimizer that optimizes the tensor program for the specific memory layout. We demonstrate empirically significant performance improvements over state-of-the-art tensor processing systems.
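For readers unfamiliar with the two layouts named in the abstract, the following is a minimal sketch of the Coordinate (COO) and Compressed Sparse Row (CSR) formats for a small matrix, with a matrix-vector product written separately for each layout. It is not the paper's system or its declarative query language. The function names (`to_coo`, `to_csr`, `spmv_coo`, `spmv_csr`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Dense = List[List[float]]
COO = Tuple[List[int], List[int], List[float]]   # (row indices, column indices, values)
CSR = Tuple[List[int], List[int], List[float]]   # (row pointers, column indices, values)


def to_coo(a: Dense) -> COO:
    """Store one (row, column, value) triple per nonzero entry."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(a):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals


def to_csr(a: Dense) -> CSR:
    """Store nonzeros row by row; row_ptr[i]..row_ptr[i+1] delimits row i."""
    row_ptr, col_idx, vals = [0], [], []
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(vals))
    return row_ptr, col_idx, vals


def spmv_coo(coo: COO, x: List[float], n_rows: int) -> List[float]:
    """y = A @ x, iterating over the COO triples."""
    y = [0.0] * n_rows
    for i, j, v in zip(*coo):
        y[i] += v * x[j]
    return y


def spmv_csr(csr: CSR, x: List[float]) -> List[float]:
    """y = A @ x, iterating over each row's slice of the CSR arrays."""
    row_ptr, col_idx, vals = csr
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum((vals[k] * x[col_idx[k]] for k in range(start, end)), 0.0))
    return y


A = [[1.0, 0.0, 2.0],
     [0.0, 0.0, 0.0],
     [0.0, 3.0, 0.0]]
x = [1.0, 2.0, 3.0]

coo = to_coo(A)
csr = to_csr(A)
y_coo = spmv_coo(coo, x, len(A))
y_csr = spmv_csr(csr, x)
```

Note that the same product needs a different loop for each layout. This is the kind of per-format specialization that, according to the abstract, existing systems handle through dedicated extensions.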