Paper Title

On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks

Authors

Hongru Yang, Zhangyang Wang

Abstract

Motivated by both theory and practice, we study how random pruning of the weights affects a neural network's neural tangent kernel (NTK). In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version. The equivalence is established under two cases. The first main result studies the infinite-width asymptotics. It is shown that given a pruning probability, for fully-connected neural networks with the weights randomly pruned at initialization, as the width of each layer grows to infinity sequentially, the NTK of the pruned neural network converges to the limiting NTK of the original network with some extra scaling. If the network weights are rescaled appropriately after pruning, this extra scaling can be removed. The second main result considers the finite-width case. It is shown that to ensure the NTK's closeness to the limit, the dependence of width on the sparsity parameter is asymptotically linear, as the NTK's gap to its limit goes down to zero. Moreover, if the pruning probability is set to zero (i.e., no pruning), the bound on the required width matches the bound for fully-connected neural networks in previous works up to logarithmic factors. The proof of this result requires developing a novel analysis of a network structure which we call "mask-induced pseudo-networks". Experiments are provided to evaluate our results.
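As a rough illustration of the setup described in the abstract, the sketch below (not the authors' code) prunes the weights of a fully-connected ReLU network i.i.d. at initialization with probability p, optionally rescales the surviving weights by 1/sqrt(1-p) (an illustrative variance-preserving choice, not necessarily the paper's exact rescaling factor), and compares the empirical NTK Θ(x, x') = ⟨∇θ f(x), ∇θ f(x')⟩ of the pruned network against the dense one. PyTorch and the helper names (init_layers, forward, empirical_ntk) are assumptions made for this sketch.

```python
import torch

def init_layers(dims, p=0.0, rescale=False, seed=0):
    """NTK-parameterized weights with i.i.d. Bernoulli pruning masks applied at init."""
    g = torch.Generator().manual_seed(seed)
    weights = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        w = torch.randn(d_out, d_in, generator=g)            # N(0, 1) entries
        mask = (torch.rand(d_out, d_in, generator=g) >= p)   # each weight kept with prob 1 - p
        w = w * mask
        if rescale and p < 1.0:
            w = w / (1.0 - p) ** 0.5                          # keep the per-weight second moment at 1 (illustrative choice)
        weights.append(w.requires_grad_(True))
    return weights

def forward(x, weights):
    """Fully-connected ReLU network in NTK parameterization (1/sqrt(fan_in) scaling per layer)."""
    h = x
    for w in weights[:-1]:
        h = torch.relu(h @ w.t() / w.shape[1] ** 0.5)
    return h @ weights[-1].t() / weights[-1].shape[1] ** 0.5

def empirical_ntk(x1, x2, weights):
    """Empirical NTK <grad_theta f(x1), grad_theta f(x2)> for a scalar-output network."""
    g1 = torch.autograd.grad(forward(x1, weights).sum(), weights)
    g2 = torch.autograd.grad(forward(x2, weights).sum(), weights)
    return sum((a * b).sum() for a, b in zip(g1, g2)).item()

d, width, p = 16, 2048, 0.5
x1, x2 = torch.randn(1, d), torch.randn(1, d)
dense = init_layers([d, width, width, 1], p=0.0, seed=0)
pruned = init_layers([d, width, width, 1], p=p, rescale=True, seed=0)
print("dense NTK :", empirical_ntk(x1, x2, dense))
print("pruned NTK:", empirical_ntk(x1, x2, pruned))  # the two values should get closer as the width grows (single random draw)
```

Setting rescale=False in this sketch leaves the extra scaling in place, which mirrors the first main result's statement that the pruned network's NTK converges to the original limiting NTK only up to an additional factor.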
