论文标题
神经网络通过有效的滤波器分析和分层修剪来压缩
Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning
论文作者
论文摘要
网络压缩对于使深网的效率更高,更快且可推广到低端硬件至关重要。当前的网络压缩方法有两个开放问题:首先,缺乏理论框架来估计最大压缩率;其次,某些层可能会过多地进行,从而导致网络性能大幅下降。为了解决这两个问题,这项研究提出了一种基于梯度 - 矩阵分析方法,以估计最大网络冗余。在最大速率的指导下,开发了一种新颖有效的层次网络修剪算法,以最大程度地凝结神经元网络结构而无需牺牲网络性能。进行大量实验以证明新方法修剪几个高级卷积神经网络(CNN)结构的功效。与现有的修剪方法相比,拟议的修剪算法实现了最先进的性能。与其他方法相比,在相同或相似的压缩比下,新方法提供了最高的网络预测准确性。
Network compression is crucial to making the deep networks to be more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there lacks a theoretical framework to estimate the maximum compression rate; second, some layers may get over-prunned, resulting in significant network performance drop. To solve these two problems, this study propose a gradient-matrix singularity analysis-based method to estimate the maximum network redundancy. Guided by that maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the neuronal network structure without sacrificing network performance. Substantial experiments are performed to demonstrate the efficacy of the new method for pruning several advanced convolutional neural network (CNN) architectures. Compared to existing pruning methods, the proposed pruning algorithm achieved state-of-the-art performance. At the same or similar compression ratio, the new method provided the highest network prediction accuracy as compared to other methods.