Paper Title
An Extension of Fano's Inequality for Characterizing Model Susceptibility to Membership Inference Attacks
Paper Authors
Paper Abstract
Deep neural networks have been shown to be vulnerable to membership inference attacks, in which the attacker aims to detect whether specific input data were used to train the model. These attacks can potentially leak private or proprietary data. We present a new extension of Fano's inequality and employ it to theoretically establish that the probability of success for a membership inference attack on a deep neural network can be bounded using the mutual information between its inputs and its activations. This enables the use of mutual information to measure the susceptibility of a DNN model to membership inference attacks. In our empirical evaluation, we show that the correlation between the mutual information and the susceptibility of the DNN model to membership inference attacks is 0.966, 0.996, and 0.955 for CIFAR-10, SVHN and GTSRB models, respectively.
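
For orientation only, the classical Fano inequality (not the extension proposed in the paper, which the abstract does not state) may help show how mutual information can bound a success probability. The symbols below are assumptions introduced for illustration: X is a random variable on a finite set \mathcal{X}, Y is an observation, \hat{X} = g(Y) is any estimate of X, P_e is the error probability, and H_b is the binary entropy function. All logarithms use the same base.

```latex
% Classical Fano inequality (textbook form, not the paper's extension).
% Assumed notation: P_e = \Pr(\hat{X} \neq X), with \hat{X} = g(Y).
\begin{align}
  H(X \mid Y) &\le H_b(P_e) + P_e \log\bigl(|\mathcal{X}| - 1\bigr).
\end{align}
% If X is uniform on \mathcal{X}, then H(X \mid Y) = \log|\mathcal{X}| - I(X;Y).
% Using H_b(P_e) \le \log 2 and \log(|\mathcal{X}| - 1) \le \log|\mathcal{X}|:
\begin{align}
  P_e &\ge 1 - \frac{I(X;Y) + \log 2}{\log|\mathcal{X}|},
  &
  1 - P_e &\le \frac{I(X;Y) + \log 2}{\log|\mathcal{X}|}.
\end{align}
```

Read this way, a smaller mutual information I(X;Y) forces a smaller upper bound on the success probability 1 - P_e. The abstract describes a bound of a similar flavor for membership inference, with the inputs and activations of the network in the roles of the two variables.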