Paper Title

Beyond Uniform Lipschitz Condition in Differentially Private Optimization

Authors

Rudrajit Das, Satyen Kale, Zheng Xu, Tong Zhang, Sujay Sanghavi

Abstract

Most prior results on differentially private stochastic gradient descent (DP-SGD) are derived under the simplistic assumption of uniform Lipschitzness, i.e., the per-sample gradients are uniformly bounded. We generalize uniform Lipschitzness by assuming that the per-sample gradients have sample-dependent upper bounds, i.e., per-sample Lipschitz constants, which themselves may be unbounded. We provide principled guidance on choosing the clip norm in DP-SGD for convex over-parameterized settings satisfying our general version of Lipschitzness when the per-sample Lipschitz constants are bounded; specifically, we recommend tuning the clip norm only till values up to the minimum per-sample Lipschitz constant. This finds application in the private training of a softmax layer on top of a deep network pre-trained on public data. We verify the efficacy of our recommendation via experiments on 8 datasets. Furthermore, we provide new convergence results for DP-SGD on convex and nonconvex functions when the Lipschitz constants are unbounded but have bounded moments, i.e., they are heavy-tailed.
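
For reference, below is a minimal NumPy sketch of a single DP-SGD step with per-sample gradient clipping and Gaussian noise. This is an illustrative reconstruction of the standard DP-SGD recipe, not the authors' implementation; names such as `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are assumptions introduced for the example. The paper's recommendation concerns how `clip_norm` is tuned: in the convex over-parameterized setting, it suffices to search only over values up to the minimum per-sample Lipschitz constant.

```python
# Minimal sketch of one DP-SGD step (per-sample clipping + Gaussian noise).
# Illustrative only; not the authors' code.
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update on a batch of per-sample gradients.

    per_sample_grads: array of shape (batch_size, dim), one gradient per sample.
    clip_norm: clipping threshold C; per the paper's recommendation, tuned only
        over values up to the minimum per-sample Lipschitz constant.
    noise_multiplier: Gaussian noise scale relative to C (assumed name).
    """
    batch_size = per_sample_grads.shape[0]

    # Clip each per-sample gradient to Euclidean norm at most clip_norm.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale

    # Sum the clipped gradients, add noise calibrated to clip_norm, and average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size

    # Gradient-descent update with the privatized mean gradient.
    return params - lr * noisy_mean
```

In the paper's softmax-layer application, the per-sample gradients above would come from a linear (softmax) layer trained on features extracted by a deep network pre-trained on public data, and `clip_norm` would be searched only up to the smallest per-sample Lipschitz constant of that convex loss.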
