Label Leakage and Protection from Forward Embedding in Vertical Federated Learning

Paper Title

Label Leakage and Protection from Forward Embedding in Vertical Federated Learning

Authors

Sun, Jiankai; Yang, Xin; Yao, Yuanshun; Wang, Chong

Abstract

Vertical federated learning (vFL) has gained much attention in recent years and has been deployed to solve machine learning problems with data privacy concerns. However, recent work has demonstrated that vFL is vulnerable to privacy leakage even though only the forward intermediate embedding (rather than raw features) and backpropagated gradients (rather than raw labels) are communicated between the participants. Because raw labels often contain highly sensitive information, several recent methods have been proposed to prevent label leakage from the backpropagated gradients in vFL. However, these works identify and defend against only the threat of label leakage from the backpropagated gradients; none of them addresses label leakage from the intermediate embedding. In this paper, we propose a practical label inference method that can steal private labels effectively from the shared intermediate embedding even when existing protection methods such as label differential privacy and gradient perturbation are applied. The effectiveness of the label attack is closely tied to the correlation between the intermediate embedding and the corresponding private labels. To mitigate label leakage from the forward embedding, we add an additional optimization goal at the label party that limits the adversary's label-stealing ability by minimizing the distance correlation between the intermediate embedding and the corresponding private labels. We conducted extensive experiments to demonstrate the effectiveness of our proposed protection methods.
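
The abstract names distance correlation between the intermediate embedding and the private labels as the quantity the protection minimizes. The following is a minimal sketch of how that statistic can be computed on one batch, assuming the standard biased sample estimator built from double-centered pairwise distance matrices. It is not the authors' implementation: the function names `distance_correlation` and `_centered_distances` are hypothetical, and a training-time version would need to be written in a differentiable framework so it can serve as a loss term.

```python
import math


def _centered_distances(rows):
    """Pairwise Euclidean distance matrix of `rows`, double-centered."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long lists of vectors.

    x: embeddings, one sequence of floats per sample (assumed layout).
    y: labels, one sequence of floats per sample, e.g. [0.0] or [1.0].
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2_xy = mean_product(a, b)  # squared sample distance covariance
    dvar2_x = mean_product(a, a)   # squared sample distance variance of x
    dvar2_y = mean_product(b, b)   # squared sample distance variance of y
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        # One side is constant across the batch: define the statistic as 0.
        return 0.0
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(dcov2_xy, 0.0) / denom)
```

A value near 1 means the embedding carries a lot of information about the labels, while a value near 0 suggests little dependence. Under the assumptions above, a label party could add a weighted multiple of this quantity to its training loss to discourage embeddings that reveal labels.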
