Paper Title

FPIRM: Floating-point Processing in Racetrack Memories

Authors

Sébastien Ollivier, Xinyi Zhang, Yue Tang, Chayanika Choudhuri, Jingtong Hu, Alex K. Jones

Abstract

Convolutional neural networks (CNNs) have become a ubiquitous algorithm with growing applications in mobile and edge settings. We describe a compute-in-memory (CIM) technique called FPIRM that uses Racetrack Memory (RM) to accelerate CNNs for edge systems. Using transverse read, a technique that can determine the number of '1's in multiple adjacent domains, FPIRM can efficiently implement multi-operand bulk-bitwise and addition computations, as well as two-operand multiplication. We discuss how FPIRM can implement both variable-precision integer and floating-point arithmetic. This allows both CNN inference and on-device training without expensive data movement to the cloud. Based on these functions, we demonstrate the implementation of several CNNs with back propagation using RM CIM and compare them to state-of-the-art CIM inference and training implementations on Field-Programmable Gate Arrays (FPGAs). During training, FPIRM improves efficiency by 2$\times$, reducing energy consumption by at least 27% and increasing throughput by at least 18% compared to the FPGA.
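
The mechanism the abstract describes, a transverse read that reports how many adjacent domains hold '1', can be illustrated in software: counting the ones in each bit column (plus carries from the previous column) yields multi-operand addition, and feeding shifted partial products into that adder yields two-operand multiplication. The sketch below is a minimal behavioral model under those assumptions, not the paper's hardware design; `transverse_read`, `multi_operand_add`, and `multiply` are hypothetical names introduced only for illustration.

```python
def transverse_read(domains):
    """Behavioral model of a transverse read (TR): return how many of
    the adjacent domains hold '1'. In FPIRM this count comes from the
    memory hardware itself; here we simply sum a list of bits."""
    return sum(domains)

def multi_operand_add(operands, width):
    """Add N integers by counting '1's per bit column, TR-style.

    Each column count splits into a sum bit (count mod 2) and carries
    (count // 2) injected into the next column, mimicking a
    counter-based CIM adder."""
    total = 0
    carries = 0
    # Extra iterations give headroom for carries from N operands.
    for bit in range(width + len(operands).bit_length()):
        column = [(op >> bit) & 1 for op in operands]  # one domain per operand
        count = transverse_read(column) + carries      # ones in this column
        total |= (count & 1) << bit                    # sum bit for this position
        carries = count >> 1                           # propagate to next column
    return total

def multiply(a, b, width):
    """Two-operand multiplication by reusing the multi-operand adder:
    the shifted partial products of a * b become the operand list."""
    partials = [a << i for i in range(width) if (b >> i) & 1]
    return multi_operand_add(partials, 2 * width) if partials else 0

assert multi_operand_add([3, 5, 7, 9], 4) == 24
assert multiply(6, 7, 4) == 42
```

Extending the same counting primitive to variable precision is then a matter of choosing `width`; the floating-point path described in the paper additionally handles exponent alignment before the mantissa additions, which this integer-only sketch omits.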
