A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

Paper Title

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

Authors

Abdelfattah, Ahmad; Anzt, Hartwig; Boman, Erik G.; Carson, Erin; Cojean, Terry; Dongarra, Jack; Gates, Mark; Grützmacher, Thomas; Higham, Nicholas J.; Li, Sherry; Lindquist, Neil; Liu, Yang; Loe, Jennifer; Luszczek, Piotr; Nayak, Pratik; Pranesh, Sri; Rajamanickam, Siva; Ribizel, Tobias; Smith, Barry; Swirydowicz, Kasia; Thomas, Stephen; Tomov, Stanimire; Tsai, Yaohung M.; Yamazaki, Ichitaro; Yang, Ulrike Meier

Abstract

In recent years, hardware vendors have started designing low-precision special function units in response to the machine learning community's demand for high compute power in low-precision formats. Server-line products increasingly feature such units as well: the NVIDIA tensor cores in ORNL's Summit supercomputer, for example, provide more than an order of magnitude higher performance than what is available in IEEE double precision. At the same time, the gap between compute power and memory bandwidth keeps increasing, making data access and communication prohibitively expensive compared to arithmetic operations. To start the multiprecision focus effort, we survey the numerical linear algebra community and summarize all existing multiprecision knowledge, expertise, and software capabilities in this landscape analysis report. We also include current efforts and preliminary results that may not yet be considered "mature technology" but have the potential to grow into production quality within the multiprecision focus effort. As we expect the reader to be familiar with the basics of numerical linear algebra, we refrain from providing a detailed background on the algorithms themselves and instead focus on how mixed- and multiprecision technology can help improve the performance of these methods, and we present highlights of applications that significantly outperform traditional fixed-precision methods.
