Paper Title
Efficient Graph Neural Network Inference at Large Scale
Paper Authors
Paper Abstract
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications. However, the enormous size of large-scale graphs hinders their application under real-time inference scenarios. Although existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedures, these methods still suffer from scalability issues when making inferences on unseen nodes, as the feature preprocessing requires the graph to be known and fixed. To speed up inference in the inductive setting, we propose a novel adaptive propagation order approach that generates a personalized propagation order for each node based on its topological information. This successfully avoids redundant computation during feature propagation. Moreover, the trade-off between accuracy and inference latency can be flexibly controlled by simple hyper-parameters to match the different latency constraints of application scenarios. To compensate for the potential loss of inference accuracy, we further propose Inception Distillation, which exploits multi-scale reception information to improve inference performance. Extensive experiments are conducted on four public datasets with different scales and characteristics, and the results show that our proposed inference acceleration framework outperforms the SOTA graph inference acceleration baselines in terms of both accuracy and efficiency. In particular, the advantage of our method is more significant on larger-scale datasets, and our framework achieves a $75\times$ inference speedup on the largest dataset, Ogbn-products.
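The abstract does not spell out the propagation rule, so the sketch below is only a minimal illustration of the general idea of per-node adaptive propagation, not the paper's actual algorithm. It assumes a hypothetical degree-based budget (`node_budget`) and a helper `propagate_adaptive`: features are propagated with a GCN/SGC-style normalized adjacency, and each node simply freezes its representation once its personal propagation order is reached, skipping the remaining (redundant) propagation steps.

```python
# Illustrative sketch only: a degree-based heuristic for per-node propagation
# depth. The budget rule and function names are assumptions, not the paper's.
import numpy as np
import scipy.sparse as sp


def propagate_adaptive(adj: sp.csr_matrix, X: np.ndarray,
                       max_order: int = 3, alpha: float = 1.0) -> np.ndarray:
    """Propagate features while letting each node stop at its own order.

    adj       : symmetric adjacency matrix without self-loops, shape (n, n)
    X         : node feature matrix, shape (n, d)
    max_order : global upper bound on the number of propagation steps
    alpha     : hyper-parameter trading accuracy for latency
                (smaller alpha -> smaller per-node budgets -> faster inference)
    """
    n = adj.shape[0]
    # Symmetrically normalized adjacency with self-loops, as in GCN/SGC.
    a_hat = adj + sp.eye(n, format="csr")
    deg = np.asarray(a_hat.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt

    # Hypothetical topology-based budget: high-degree nodes aggregate enough
    # neighborhood information early, so they receive a smaller order.
    node_budget = np.clip(
        np.ceil(alpha * max_order / np.log2(deg + 2)), 1, max_order
    ).astype(int)

    H = X.copy()
    for step in range(1, max_order + 1):
        # A real implementation would restrict the sparse matmul to the rows
        # of still-active nodes; the full product here keeps the sketch short.
        H_next = a_norm @ H
        active = node_budget >= step
        H[active] = H_next[active]  # inactive nodes keep their frozen features
    return H
```

In this toy version, `alpha` plays the role of the simple hyper-parameter mentioned in the abstract: lowering it shrinks every node's propagation budget, reducing inference latency at the cost of some accuracy.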