Paper Title

Redistribution of Weights and Activations for AdderNet Quantization

Paper Authors

Ying Nie, Kai Han, Haikang Diao, Chuanjian Liu, Enhua Wu, Yunhe Wang

Paper Abstract

Adder Neural Network (AdderNet) provides a new way of developing energy-efficient neural networks by replacing the expensive multiplications in convolution with cheaper additions (i.e., the l1-norm). To achieve higher hardware efficiency, it is necessary to further study the low-bit quantization of AdderNet. Because the commutative law that holds for multiplication does not hold for the l1-norm, the well-established quantization methods for convolutional networks cannot be directly applied to AdderNets. Thus, existing AdderNet quantization techniques propose to use only one shared scale to quantize both the weights and the activations simultaneously. Admittedly, such an approach preserves the commutative law in the l1-norm quantization process, but the accuracy drop after low-bit quantization cannot be ignored. To this end, we first thoroughly analyze the difference between the distributions of weights and activations in AdderNet and then propose a new quantization algorithm that redistributes the weights and the activations. Specifically, the pre-trained full-precision weights in different kernels are clustered into different groups; then intra-group shared and inter-group independent scales can be adopted. To further compensate for the accuracy drop caused by the distribution difference, we develop a lossless range clamp scheme for weights and a simple yet effective outlier clamp strategy for activations. Thus, the functionality of the full-precision weights and the representation ability of the full-precision activations can be fully preserved. The effectiveness of the proposed quantization method for AdderNet is well verified on several benchmarks; e.g., our 4-bit post-training quantized adder ResNet-18 achieves a 66.5% top-1 accuracy on ImageNet with comparable energy efficiency, which is about 8.5% higher than that of previous AdderNet quantization methods.
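The abstract describes two ingredients: per-group scales for weights (intra-group shared, inter-group independent) and an outlier clamp for activations. The NumPy sketch below illustrates those two ideas only; the function names, the range-based grouping heuristic, the 99.9th-percentile clamp threshold, and the single activation scale in the usage example are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def group_quantize_weights(weights, n_bits=4, n_groups=4):
    """Cluster kernels into groups and quantize with per-group scales (illustrative)."""
    qmax = 2 ** (n_bits - 1) - 1            # symmetric signed range, e.g. [-7, 7] for 4 bits
    flat = weights.reshape(weights.shape[0], -1)
    ranges = np.abs(flat).max(axis=1)       # per-kernel dynamic range

    # Simple grouping: sort kernels by range and split into equal-size groups.
    # (The paper clusters the kernels; a 1-D k-means would be a closer stand-in.)
    order = np.argsort(ranges)
    q_weights = np.empty_like(weights)
    scales = np.empty(weights.shape[0])
    for group in np.array_split(order, n_groups):
        scale = ranges[group].max() / qmax          # shared scale inside the group
        scales[group] = scale
        q_weights[group] = np.clip(np.round(weights[group] / scale), -qmax, qmax)
    return q_weights, scales

def clamp_and_quantize_activations(acts, scale, n_bits=4, pct=99.9):
    """Clamp activation outliers at a percentile, then quantize with the given scale."""
    qmax = 2 ** (n_bits - 1) - 1
    clip_val = np.percentile(np.abs(acts), pct)     # outlier clamp threshold (assumed)
    acts = np.clip(acts, -clip_val, clip_val)
    return np.clip(np.round(acts / scale), -qmax, qmax)

# Toy usage: 16 output kernels, 8 input channels, 3x3 kernels.
w = np.random.randn(16, 8, 3, 3).astype(np.float32)
x = np.random.randn(1, 8, 32, 32).astype(np.float32)
qw, scales = group_quantize_weights(w)
qx = clamp_and_quantize_activations(x, scales.mean())  # single scale here for simplicity
```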
