Paper Title
Non-Volatile Memory Accelerated Geometric Multi-Scale Resolution Analysis
Paper Authors
Paper Abstract
Dimensionality reduction algorithms are standard tools in a researcher's toolbox. They are frequently used to augment downstream tasks such as machine learning and data science, and also serve as exploratory methods for understanding complex phenomena. For instance, dimensionality reduction is commonly used in biology and neuroscience to understand data collected from biological subjects. However, dimensionality reduction techniques are limited by the von Neumann architectures on which they execute. Specifically, data-intensive algorithms such as dimensionality reduction techniques often require memory that is simultaneously fast, high capacity, and persistent, a combination that hardware has historically been unable to provide. In this paper, we present a re-implementation of an existing dimensionality reduction technique called Geometric Multi-Scale Resolution Analysis (GMRA), accelerated via a novel persistent memory technology called Memory Centric Active Storage (MCAS). Our implementation uses a specialized version of MCAS called PyMM, which provides native support for Python datatypes, including NumPy arrays and PyTorch tensors. We compare our PyMM implementation against a DRAM implementation and show that when the data fits in DRAM, PyMM offers competitive runtimes. When the data does not fit in DRAM, our PyMM implementation is still able to process it.
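To make the PyMM usage pattern described in the abstract concrete, the following is a minimal sketch of placing a NumPy array on a persistent-memory "shelf". It is based on PyMM's published shelf API, not on code from this paper; the shelf name, the size_mb and pmem_path parameters, the /mnt/pmem0 mount point, and the array shape are all illustrative assumptions.

import numpy as np
import pymm  # Python bindings for Memory Centric Active Storage (MCAS)

# Open (or create) a shelf backed by persistent memory.
# 'size_mb' and 'pmem_path' follow PyMM's published shelf API; adjust
# the capacity and the pmem mount point to your installation.
shelf = pymm.shelf('gmra_demo', size_mb=4096, pmem_path='/mnt/pmem0')

# Assigning a NumPy array to a shelf attribute stores it on persistent
# memory, so it survives process restarts and its size is bounded by
# the persistent-memory capacity rather than by DRAM.
shelf.points = np.random.rand(100_000, 64).astype(np.float32)

# Shelf-resident arrays support ordinary NumPy operations in place.
shelf.points += 1.0
print(shelf.points.shape)

A downstream algorithm such as GMRA could then read shelf.points directly, which is what allows the paper's implementation to operate on datasets that exceed DRAM capacity.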