Paper Title

Temporal Vectorization for Stencils

Authors

Liang Yuan, Hang Cao, Yunquan Zhang, Kun Li, Pengqi Lu, Yue Yue

Abstract

Stencil computations represent a very common class of nested loops in scientific and engineering applications. Exploiting vector units in modern CPUs is crucial to achieving peak performance. Previous vectorization approaches often consider the data space, in particular the innermost unit-strided loop. This leads to the well-known data alignment conflict problem, in which vector loads overlap because consecutive stencil computations share data. This paper proposes a novel temporal vectorization scheme for stencils. It vectorizes the stencil computation in the iteration space and assembles points with different time coordinates in one vector. The temporal vectorization requires only a small, fixed number of vector reorganizations, independent of the vector length, stencil order, and dimension. Furthermore, it is also applicable to Gauss-Seidel stencils, whose vectorization is not well studied. The effectiveness of the temporal vectorization is demonstrated on various Jacobi and Gauss-Seidel stencils.
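
The two ideas in the abstract can be illustrated with a small scalar simulation. The sketch below is not the paper's implementation: it assumes a 1D 3-point Jacobi stencil with fixed boundary values, a hypothetical vector length `VL = 4`, and a lane spacing of 2 grid points, chosen only because it satisfies the dependences of this particular stencil. `spatial_vector_loads` shows the overlapping loads behind the data alignment conflict, and `temporal_sweep` shows how each "lane" can hold a point at a different time coordinate. The vector reorganization instructions themselves are not modeled.

```python
VL = 4  # assumed vector length; also the number of time steps fused per sweep


def jacobi_step(a):
    """One sweep of a 1D 3-point Jacobi stencil; boundary values stay fixed."""
    b = a[:]
    for i in range(1, len(a) - 1):
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return b


def spatial_vector_loads(i, vl=VL):
    """Indices touched by the three vector loads needed to update a[i:i+vl].

    The three ranges are shifted by one element, so they overlap and at most
    one of them can start on an aligned address.
    """
    return [list(range(i + d, i + d + vl)) for d in (-1, 0, 1)]


def temporal_sweep(a, vl=VL, stride=2):
    """Advance the grid by vl time steps in a single pass over space.

    Lane k updates point x - stride*k from time step k to k + 1, so one
    "vector" holds points with vl different time coordinates. With
    stride >= 2, every input of lane k was produced by lane k - 1 in an
    earlier iteration of x, so the lanes of one iteration are independent.
    """
    n = len(a)
    # levels[k] is the grid at time step k; boundaries are pre-filled.
    levels = [a[:] for _ in range(vl + 1)]
    for x in range(1, n - 1 + stride * (vl - 1)):
        for k in range(vl):  # these would be the lanes of one SIMD vector
            p = x - stride * k
            if 1 <= p <= n - 2:
                src = levels[k]
                levels[k + 1][p] = (src[p - 1] + src[p] + src[p + 1]) / 3.0
    return levels[vl]


def reference(a, steps=VL):
    """Apply jacobi_step repeatedly, one full sweep per time step."""
    for _ in range(steps):
        a = jacobi_step(a)
    return a
```

For any input grid, `temporal_sweep(grid)` should match `reference(grid)`, since both perform the same arithmetic on the same operands in a different traversal order. A real SIMD version would keep only a few vectors live instead of storing every time level.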
