Paper Title
LTU Attacker for Membership Inference
Paper Authors
Paper Abstract
We address the problem of defending predictive models, such as machine learning classifiers (Defender models), against membership inference attacks, in both the black-box and white-box settings, when the trainer and the trained model are publicly released. The Defender aims at optimizing a dual objective: utility and privacy. Both utility and privacy are evaluated with an external apparatus including an Attacker and an Evaluator. On one hand, Reserved data, distributed similarly to the Defender training data, is used to evaluate Utility; on the other hand, Reserved data, mixed with Defender training data, is used to evaluate membership inference attack robustness. In both cases, classification accuracy or error rate is used as the metric: Utility is evaluated with the classification accuracy of the Defender model; Privacy is evaluated with the membership prediction error of a so-called "Leave-Two-Unlabeled" (LTU) Attacker, having access to all of the Defender and Reserved data, except for the membership label of one sample from each. We prove that, under certain conditions, even a "naïve" LTU Attacker can achieve lower bounds on privacy loss with simple attack strategies, leading to concrete necessary conditions to protect privacy, including: preventing over-fitting and adding some amount of randomness. However, we also show that such a naïve LTU Attacker can fail to attack the privacy of models known to be vulnerable in the literature, demonstrating that knowledge must be complemented with strong attack strategies to turn the LTU Attacker into a powerful means of evaluating privacy. Our experiments on the QMNIST and CIFAR-10 datasets validate our theoretical results and confirm the roles that over-fitting prevention and randomness play in protecting algorithms against privacy attacks.
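To make the "Leave-Two-Unlabeled" evaluation concrete, the following is a minimal, illustrative sketch (not the paper's reference implementation) of a naïve LTU-style attack that exploits over-fitting: given a trained Defender and two unlabeled samples, exactly one of which came from the Defender's training set, guess that the sample with the lower prediction loss is the training member. The toy data, the function names naive_ltu_guess and cross_entropy, and the use of scikit-learn's LogisticRegression as a stand-in Defender are all assumptions made for illustration only.

import numpy as np
from sklearn.linear_model import LogisticRegression

def cross_entropy(model, x, y):
    # Per-sample cross-entropy loss of the Defender model (hypothetical helper).
    proba = model.predict_proba(x.reshape(1, -1))[0]
    return -np.log(proba[y] + 1e-12)

def naive_ltu_guess(defender, x_a, y_a, x_b, y_b):
    # Return 0 if sample A is guessed to be the training member, else 1.
    # Lower loss is taken as evidence of membership (over-fitting signal).
    loss_a = cross_entropy(defender, x_a, y_a)
    loss_b = cross_entropy(defender, x_b, y_b)
    return 0 if loss_a <= loss_b else 1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data standing in for Defender training data and Reserved data.
    X_train, y_train = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)
    X_reserved, y_reserved = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)

    defender = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Membership prediction error over many leave-two-unlabeled pairs;
    # by construction, sample A is always the true member here.
    errors, n_pairs = 0, 200
    for i in range(n_pairs):
        j = rng.integers(len(X_reserved))
        guess = naive_ltu_guess(defender, X_train[i], y_train[i],
                                X_reserved[j], y_reserved[j])
        errors += (guess != 0)
    print("membership prediction error:", errors / n_pairs)

Under this reading of the protocol, a membership prediction error near 0.5 corresponds to good privacy (the attacker cannot do better than chance), while an error well below 0.5 indicates leakage, for example due to over-fitting of the Defender model.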