Binary Neural Networks as a General-Purpose Compute Paradigm for On-Device Computer Vision

Paper title

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

Authors

Guhong Nie, Lirui Xiao, Menglong Zhu, Dongliang Chu, Yue Shen, Peng Li, Kang Yang, Li Du, Bo Chen

Abstract

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a better speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks. To this end, we propose a BNN framework comprising 1) a minimalistic inference scheme for hardware-friendliness, 2) an over-parameterized training scheme for high accuracy, and 3) a simple procedure to adapt to different vision tasks. The resultant framework overtakes 8-bit quantization in the speed-vs-accuracy tradeoff for classification, detection, segmentation, super-resolution and matching: our BNNs not only retain the accuracy levels of their 8-bit baselines but also showcase 1.3-2.4$\times$ higher FPS on mobile CPUs. Similar conclusions can be drawn for prototypical systolic-array-based AI accelerators, where our BNNs promise 2.8-7$\times$ fewer execution cycles than 8-bit and 2.1-2.7$\times$ fewer cycles than alternative BNN designs. These results suggest that the time for large-scale BNN adoption could be upon us.
