Title

Bottleneck Analysis of Dynamic Graph Neural Network Inference on CPU and GPU

Authors

Hanqiu Chen, Yahya Alhinai, Yihan Jiang, Eunjee Na, Cong Hao

Abstract


Dynamic graph neural networks (DGNNs) are becoming increasingly popular because of their widespread use in capturing dynamic features in the real world. A variety of dynamic graph neural networks designed from algorithmic perspectives have succeeded in incorporating temporal information into graph processing. Despite the promising algorithmic performance, deploying DGNNs on hardware presents additional challenges due to model complexity, diversity, and the nature of the time dependency. Meanwhile, the differences between DGNNs and static graph neural networks make hardware-related optimizations for static graph neural networks unsuitable for DGNNs. In this paper, we select eight prevailing DGNNs with different characteristics and profile them on both CPU and GPU. The profiling results are summarized and analyzed, providing in-depth insights into the bottlenecks of DGNNs on hardware and identifying potential optimization opportunities for future DGNN acceleration. Together with a comprehensive survey, we provide a detailed analysis of DGNN performance bottlenecks on hardware, including temporal data dependency, workload imbalance, data movement, and GPU warm-up. We suggest several optimizations from both software and hardware perspectives. This paper is the first to provide an in-depth analysis of the hardware performance of DGNNs. Code is available at https://github.com/sharc-lab/DGNN_analysis.
