Paper Title
A Parallel Sparse Tensor Benchmark Suite on CPUs and GPUs
Paper Authors
Paper Abstract
Tensor computations present significant performance challenges that affect a wide spectrum of applications, ranging from machine learning, healthcare analytics, social network analysis, and data mining to quantum chemistry and signal processing. Efforts to improve the performance of tensor computations include exploring data layout, execution scheduling, and parallelism in common tensor kernels. This work presents a benchmark suite for arbitrary-order sparse tensor kernels using state-of-the-art tensor formats, coordinate (COO) and hierarchical coordinate (HiCOO), on CPUs and GPUs. It provides a set of reference tensor kernel implementations that are compatible with real-world tensors and with power-law tensors generated by extending synthetic graph generation techniques. We also propose Roofline performance models for these kernels to provide insight into computer platforms from a sparse tensor perspective.
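
As a rough illustration of two ideas named in the abstract, the sketch below stores a third-order sparse tensor in coordinate (COO) form, one index tuple plus one value per nonzero, and evaluates the standard Roofline bound, min(peak compute, memory bandwidth × arithmetic intensity). The names CooTensor, scale, and roofline_gflops are hypothetical and chosen for this example only; they are not taken from the benchmark suite, and the machine numbers in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CooTensor:
    """Hypothetical COO container: one index tuple and one value per nonzero."""
    shape: Tuple[int, ...]
    indices: List[Tuple[int, ...]]
    values: List[float]

    def scale(self, alpha: float) -> "CooTensor":
        # Tensor-scalar multiply touches only the stored nonzero values;
        # the index tuples are copied unchanged.
        return CooTensor(self.shape, list(self.indices),
                         [alpha * v for v in self.values])


def roofline_gflops(peak_gflops: float, bandwidth_gbs: float,
                    arithmetic_intensity: float) -> float:
    """Attainable GFLOP/s under the basic Roofline model.

    arithmetic_intensity is in FLOPs per byte moved to or from memory.
    """
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)


# A 2 x 2 x 3 tensor with two nonzeros.
x = CooTensor(shape=(2, 2, 3),
              indices=[(0, 0, 1), (1, 1, 2)],
              values=[1.5, -2.0])
y = x.scale(2.0)
```

With an assumed machine of 1000 GFLOP/s peak and 100 GB/s bandwidth, a kernel at 0.125 FLOPs per byte is memory-bound: roofline_gflops(1000.0, 100.0, 0.125) evaluates to 12.5 GFLOP/s. Sparse kernels in COO form tend to sit in this low-intensity region because each nonzero carries its indices along with its value.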