Paper Title
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Paper Authors
Paper Abstract
Differentially private SGD (DP-SGD) is one of the most popular methods for solving differentially private empirical risk minimization (ERM). Because it adds noisy perturbations to each gradient update, the error rate of DP-SGD scales with the ambient dimension $p$, the number of parameters in the model. Such dependence can be problematic for over-parameterized models where $p \gg n$, the number of training samples. Existing lower bounds on private ERM show that such dependence on $p$ is inevitable in the worst case. In this paper, we circumvent the dependence on the ambient dimension by leveraging a low-dimensional structure of the gradient space in deep networks -- that is, the stochastic gradients for deep nets usually stay in a low-dimensional subspace during training. We propose Projected DP-SGD, which performs noise reduction by projecting the noisy gradients onto a low-dimensional subspace, given by the top gradient eigenspace on a small public dataset. We provide a general sample complexity analysis on the public dataset for the gradient subspace identification problem and demonstrate that, under certain low-dimensional assumptions, the public sample complexity grows only logarithmically in $p$. Finally, we provide theoretical analysis and empirical evaluations to show that our method can substantially improve the accuracy of DP-SGD in the high privacy regime (corresponding to low privacy loss $\epsilon$).
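Below is a minimal NumPy sketch of the projected update described in the abstract, for intuition only. The helper names (`top_eigenspace`, `projected_dp_sgd_step`) and the eigendecomposition of the public gradients' second-moment matrix are illustrative assumptions; they are not the paper's exact procedure or noise calibration.

```python
import numpy as np

def top_eigenspace(public_grads, k):
    """Estimate a top-k gradient eigenspace from per-example gradients
    on a small public dataset (illustrative assumption, m x p input)."""
    # Second-moment matrix of the public gradients: (1/m) * G^T G.
    M = public_grads.T @ public_grads / public_grads.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, vecs = np.linalg.eigh(M)
    return vecs[:, -k:]  # p x k basis for the top-k eigenspace

def projected_dp_sgd_step(w, private_grads, V, lr, clip, sigma, rng):
    """One Projected DP-SGD step: clip per-example gradients, average,
    add Gaussian noise, then project onto the public subspace V."""
    # Per-example clipping bounds each example's contribution by `clip`.
    norms = np.linalg.norm(private_grads, axis=1, keepdims=True)
    clipped = private_grads * np.minimum(1.0, clip / norms)
    g = clipped.mean(axis=0)
    # Gaussian noise calibrated to the clipping bound (standard DP-SGD).
    noisy = g + rng.normal(0.0, sigma * clip / len(private_grads), size=g.shape)
    # Noise reduction: keep only the component lying in the k-dim subspace,
    # so the noise contributes roughly k (not p) dimensions of error.
    return w - lr * (V @ (V.T @ noisy))
```

The projection `V @ (V.T @ noisy)` is where the dimension dependence is reduced: the isotropic Gaussian noise has expected squared norm proportional to the subspace dimension $k$ after projection, rather than to the ambient dimension $p$.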