Paper Title
Are All Edges Necessary? A Unified Framework for Graph Purification
Paper Authors
Abstract
Graph Neural Networks (GNNs), deep learning models that operate on graph-structured data, have achieved state-of-the-art performance in many works. However, it has been shown repeatedly that not all edges in a graph are necessary for training machine learning models; in other words, some connections between nodes may bring redundant or even misleading information to downstream tasks. In this paper, we provide a method to drop edges in order to purify graph data from a new perspective. Specifically, we propose a framework that purifies graphs with the least loss of information, under which the core problems are how to better evaluate edges and how to delete the relatively redundant ones with minimal information loss. To address these two problems, we propose several measurements for edge evaluation and different judges and filters for edge deletion. We also introduce a residual-iteration strategy and a surrogate model for measurements that require unknown information. Experimental results show that our proposed KL-divergence measurement, combined with constraints that maintain graph connectivity and an iterative deletion scheme, can remove the most edges while preserving GNN performance. Moreover, further experiments show that this method also achieves the best defense performance against adversarial attacks.
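The purification loop the abstract describes (evaluate each edge, iteratively delete the least informative one, subject to a connectivity constraint) might be sketched as below. This is a minimal illustration, not the paper's implementation: the function names, the use of neighbor-label distributions, and the one-sided KL score for an edge are all assumptions made for the example.

```python
import math
import networkx as nx

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions (smoothed with eps)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def neighbor_label_dist(G, node, labels, num_classes):
    """Empirical label distribution over a node's neighbors."""
    counts = [0.0] * num_classes
    for nbr in G.neighbors(node):
        counts[labels[nbr]] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

def purify_graph(G, labels, num_classes, drop_ratio=0.2):
    """Iteratively drop the edge whose removal perturbs the endpoint's
    neighbor-label distribution the least (lowest KL score), skipping
    any removal that would disconnect the graph."""
    G = G.copy()
    budget = int(drop_ratio * G.number_of_edges())
    for _ in range(budget):
        best_edge, best_score = None, float("inf")
        for u, v in list(G.edges()):
            p_u = neighbor_label_dist(G, u, labels, num_classes)
            G.remove_edge(u, v)           # tentatively delete the edge
            if nx.is_connected(G):        # connectivity constraint
                q_u = neighbor_label_dist(G, u, labels, num_classes)
                score = kl_divergence(p_u, q_u)
                if score < best_score:
                    best_edge, best_score = (u, v), score
            G.add_edge(u, v)              # restore before trying the next edge
        if best_edge is None:             # no edge can be removed safely
            break
        G.remove_edge(*best_edge)
    return G
```

The O(iterations × edges) re-scoring is what the abstract's "iterative way" implies in the simplest form; a residual-iteration strategy would presumably re-score only edges near the last deletion.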