Paper Title
WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic
Paper Authors
Paper Abstract
Low-resolution neural networks represent both weights and activations with few bits, drastically reducing the multiplication complexity. Nonetheless, these products are accumulated using high-resolution (typically 32-bit) additions, an operation that dominates the arithmetic complexity of inference when using extreme quantization (e.g., binary weights). To further optimize inference, we propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts. We achieve resilience to low-resolution accumulation by inserting a cyclic activation layer, as well as an overflow penalty regularizer. We demonstrate the efficacy of our approach on both software and hardware platforms.
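The abstract describes making a network tolerant of low-resolution accumulation by inserting a cyclic activation layer and adding an overflow penalty regularizer. The sketch below is only an illustration of that idea under simple assumptions: the function names (wrap_int8, cyclic_activation, overflow_penalty) are hypothetical, and treating the cyclic activation as the two's-complement wrap of the accumulator is just one possible formulation, not necessarily the paper's exact one.

import numpy as np

def wrap_int8(x):
    # Simulate two's-complement wrap-around of a signed 8-bit accumulator:
    # values outside [-128, 127] wrap cyclically, as in low-resolution hardware.
    return ((x + 128) % 256) - 128

def cyclic_activation(acc):
    # Illustrative cyclic activation: the output is a periodic function of the
    # accumulator, so a wrapped (overflowed) sum yields the same activation as
    # the true sum would after wrapping.
    return wrap_int8(acc)

def overflow_penalty(true_acc, bits=8):
    # Hypothetical overflow-penalty regularizer: penalize pre-activation
    # magnitudes that exceed the representable range of a `bits`-bit signed
    # accumulator, encouraging the network to avoid overflow during training.
    limit = 2 ** (bits - 1) - 1
    excess = np.maximum(np.abs(true_acc) - limit, 0.0)
    return float(np.mean(excess))

# Example: an integer dot product whose exact value overflows int8.
w = np.array([3, -5, 7, 2], dtype=np.int32)
a = np.array([30, -20, 25, 40], dtype=np.int32)
true_sum = int(np.dot(w, a))      # 445, outside [-128, 127]
wrapped = wrap_int8(true_sum)     # -67, what an 8-bit accumulator would hold
penalty = overflow_penalty(np.array([true_sum]))  # 318.0, distance past the limit
print(true_sum, wrapped, penalty)

In this toy example the cyclic activation makes the wrapped accumulator value usable downstream, while the penalty term gives the optimizer a signal to shrink pre-activation sums back into the 8-bit range.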