Paper Title
Global Attention Improves Graph Networks Generalization
Paper Authors
Paper Abstract
This paper advocates incorporating a Low-Rank Global Attention (LRGA) module, a computation- and memory-efficient variant of dot-product attention (Vaswani et al., 2017), into Graph Neural Networks (GNNs) to improve their generalization power. To theoretically quantify the generalization properties granted by adding the LRGA module to GNNs, we focus on a specific family of expressive GNNs and show that augmenting it with LRGA provides algorithmic alignment to a powerful graph isomorphism test, namely the 2-Folklore Weisfeiler-Lehman (2-FWL) algorithm. In more detail, we: (i) consider the recent Random Graph Neural Network (RGNN) framework (Sato et al., 2020) and prove that it is universal in probability; (ii) show that RGNN augmented with LRGA aligns with the 2-FWL update step via polynomial kernels; and (iii) bound the sample complexity of the kernel's feature map when learned with a randomly initialized two-layer MLP. From a practical point of view, augmenting existing GNN layers with LRGA produces state-of-the-art results on current GNN benchmarks. Lastly, we observe that augmenting various GNN architectures with LRGA often closes the performance gap between different models.
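The abstract does not reproduce the exact LRGA formulation, but the core idea it names, a computation- and memory-efficient variant of dot-product attention, can be sketched as follows. This is a minimal, illustrative PyTorch sketch, not the paper's definition: the class name, the `rank` parameter, the linear maps, and the normalization are all assumptions made for illustration. The point it demonstrates is that associating the product as Q(KᵀV) instead of (QKᵀ)V keeps memory and compute linear in the number of nodes, since the n×n attention matrix is never materialized.

```python
import torch
import torch.nn as nn


class LowRankGlobalAttention(nn.Module):
    """Illustrative low-rank global attention (hypothetical layer, not the paper's LRGA).

    Full dot-product attention forms an n x n matrix (Q K^T) V.
    A low-rank variant multiplies in the order Q (K^T V), so cost
    stays O(n * rank) in both memory and compute.
    """

    def __init__(self, in_dim: int, rank: int):
        super().__init__()
        self.q = nn.Linear(in_dim, rank)
        self.k = nn.Linear(in_dim, rank)
        self.v = nn.Linear(in_dim, rank)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [n, in_dim] node features of a single graph.
        # ReLU keeps q and k non-negative so the normalization below stays positive
        # (a common linear-attention trick; assumed here, not taken from the paper).
        q = self.q(x).relu()                              # [n, rank]
        k = self.k(x).relu()                              # [n, rank]
        v = self.v(x)                                     # [n, rank]

        # Associate as q @ (k^T v): never build the n x n attention matrix.
        kv = k.transpose(0, 1) @ v                        # [rank, rank]
        out = q @ kv                                      # [n, rank]

        # Per-node normalization = row sums of q k^T, computed without forming q k^T.
        norm = (q @ k.sum(dim=0, keepdim=True).transpose(0, 1)).clamp(min=1e-6)  # [n, 1]

        # Concatenate the attention output to the original features,
        # mirroring the "augment existing GNN layers" usage described in the abstract.
        return torch.cat([x, out / norm], dim=-1)         # [n, in_dim + rank]


# Usage sketch: 100 nodes with 64-dim features, rank-16 global attention.
layer = LowRankGlobalAttention(in_dim=64, rank=16)
x = torch.randn(100, 64)
print(layer(x).shape)  # torch.Size([100, 80])
```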