Paper Title

Experimentally realized memristive memory augmented neural network

Paper Authors

Ruibin Mao, Bo Wen, Yahui Zhao, Arman Kazemi, Ann Franchesca Laguna, Michael Niemier, X. Sharon Hu, Xia Sheng, Catherine E. Graves, John Paul Strachan, Can Li

Abstract

Lifelong on-device learning is a key challenge for machine intelligence, as it requires learning from few, often single, samples. Memory-augmented neural networks have been proposed to achieve this goal, but the memory module must be stored in off-chip memory due to its size, which has heavily limited their practical use. Previous work on emerging-memory-based implementations has had difficulty scaling up, because different modules with various structures are hard to integrate on the same chip, and the small sense margin of the content-addressable memory used for the memory module severely limits the degree-of-mismatch calculation. In this work, we implement the entire memory-augmented neural network architecture on a fully integrated memristive crossbar platform and achieve an accuracy on the Omniglot dataset that closely matches standard software running on digital hardware. The successful demonstration is supported by implementing new functions in crossbars beyond the widely reported matrix multiplications. For example, the locality-sensitive hashing operation is implemented in crossbar arrays by exploiting the intrinsic stochasticity of memristor devices. In addition, the content-addressable memory module is realized in crossbars and also supports computing the degree of mismatch. Simulations based on experimentally validated models show that such an implementation can be efficiently scaled up for one-shot learning on the Mini-ImageNet dataset. The successful demonstration paves the way for practical on-device lifelong learning and opens possibilities for novel attention-based algorithms not feasible on conventional hardware.
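The abstract describes two crossbar functions beyond matrix multiplication: locality-sensitive hashing (LSH), whose random projections come from the intrinsic device-to-device variation of memristor conductances, and a content-addressable memory (CAM) that returns the degree of mismatch. A minimal software analogue of both ideas is sketched below in NumPy; it is not the paper's circuit, and the Gaussian hyperplanes, 16-bit codes, and helper names (`lsh_hash`, `cam_search`) are assumptions of this sketch, standing in for the stochastic memristor conductances and analogue CAM described in the work.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(x, planes):
    """Signed random projection: each hyperplane contributes one hash bit.
    In the hardware, the plane weights arise from random memristor
    conductances; here we simply draw them from a Gaussian (an assumption)."""
    return (x @ planes.T > 0).astype(np.uint8)

def cam_search(query_code, memory_codes):
    """Software stand-in for the memristive CAM: rank stored codes by
    degree of mismatch (Hamming distance) to the query code."""
    mismatch = np.count_nonzero(memory_codes != query_code, axis=1)
    return np.argsort(mismatch), mismatch

# Toy demo: three stored feature vectors, hashed to 16-bit codes.
dim, n_bits = 8, 16
planes = rng.standard_normal((n_bits, dim))   # random hyperplanes
memory = rng.standard_normal((3, dim))
codes = lsh_hash(memory, planes)

# A noisy copy of item 1 should yield the fewest mismatched bits.
query = memory[1] + 0.05 * rng.standard_normal(dim)
order, mismatch = cam_search(lsh_hash(query, planes), codes)
print(order, mismatch)
```

Ranking by mismatch count, rather than requiring an exact match, is what the small sense margin of conventional CAMs makes difficult and what the paper's crossbar CAM is reported to support.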
