Paper Title
Label-Only Membership Inference Attack against Node-Level Graph Neural Networks
Paper Authors
Paper Abstract
Graph Neural Networks (GNNs), inspired by Convolutional Neural Networks (CNNs), aggregate information from nodes' neighbors together with structural information to acquire expressive node representations for node classification, graph classification, and link prediction. Previous studies have indicated that GNNs are vulnerable to Membership Inference Attacks (MIAs), which infer whether a node is in the training data of a GNN and thereby leak the node's private information, such as a patient's disease history. Previous MIAs exploit the model's probability output, which is infeasible if the GNN only provides the predicted label (label-only) for an input. In this paper, we propose a label-only MIA against GNNs for node classification that exploits GNNs' flexible prediction mechanism, e.g., the ability to obtain the predicted label of a node even when its neighbors' information is unavailable. Our attack achieves around 60% accuracy, precision, and Area Under the Curve (AUC) for most datasets and GNN models, in some cases competitive with or even better than state-of-the-art probability-based MIAs implemented under our environment and settings. Additionally, we analyze the influence of the sampling method, the model selection approach, and the overfitting level on the attack performance of our label-only MIA; all of these factors affect the attack performance. We then consider scenarios in which the assumptions about the adversary's additional dataset (shadow dataset) and extra information about the target model are relaxed. Even in these scenarios, our label-only MIA achieves better attack performance in most cases. Finally, we explore the effectiveness of possible defenses, including Dropout, Regularization, Normalization, and Jumping Knowledge. None of these four defenses prevents our attack completely.
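To make the flexible prediction mechanism described above concrete, the following is a minimal, hypothetical sketch of a label-only membership decision for a single node. The model interface (`target_model(x, edge_index)` in the style of PyTorch Geometric), the names `x`, `edge_index`, and `y_true`, and the agreement-plus-correctness decision rule are all illustrative assumptions, not the paper's exact algorithm; the sketch only illustrates the idea of querying the target GNN both with and without the node's neighborhood while observing hard labels alone.

```python
import torch


def label_only_membership(target_model, x, edge_index, node_idx, y_true):
    """Hedged sketch: decide whether `node_idx` was a training node,
    using only predicted labels (no probability output).

    Assumptions (not from the paper): `target_model(x, edge_index)`
    returns per-node logits, from which we keep only the argmax to
    simulate a label-only API; the adversary knows the candidate
    node's true label `y_true[node_idx]`.
    """
    with torch.no_grad():
        # Query 1: predicted label with the full graph, i.e., the
        # node's neighbors are available for message aggregation.
        pred_full = target_model(x, edge_index)[node_idx].argmax().item()

        # Query 2: predicted label with the neighborhood removed --
        # only a self-loop remains, exploiting the GNN's ability to
        # predict a node even when neighbor information is unavailable.
        self_loop = torch.tensor([[node_idx], [node_idx]], dtype=torch.long)
        pred_solo = target_model(x, self_loop)[node_idx].argmax().item()

    # Illustrative heuristic: member (training) nodes tend to be
    # classified correctly and consistently across both queries;
    # disagreement or misclassification suggests a non-member.
    return pred_full == y_true[node_idx].item() and pred_full == pred_solo
```

The two-query design reflects the abstract's observation: because GNNs can produce a prediction for a node with or without its neighbors, the consistency of those hard labels itself leaks membership signal, even when probability outputs are withheld.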