C-SAW: A Framework for Graph Sampling and Random Walk on GPUs

Paper Title

C-SAW: A Framework for Graph Sampling and Random Walk on GPUs

Authors

Pandey, Santosh; Li, Lingda; Hoisie, Adolfy; Li, Xiaoye S.; Liu, Hang

Abstract

Many applications need to learn, mine, analyze, and visualize large-scale graphs. These graphs are often too large to be handled efficiently with conventional graph processing technologies. Recent literature suggests that graph sampling and random walk could be an efficient solution. In this paper, we propose, to the best of our knowledge, the first GPU-based framework for graph sampling and random walk. First, our framework provides a generic API that allows users to implement a wide range of sampling and random walk algorithms with ease. Second, to offload this framework onto the GPU, we introduce warp-centric parallel selection and two novel optimizations for collision migration. Third, to support graphs that exceed the GPU memory capacity, we introduce efficient data transfer optimizations for out-of-memory and multi-GPU sampling, such as workload-aware scheduling and batched multi-instance sampling. Taken together, our framework consistently outperforms state-of-the-art projects, in addition to supporting a wide range of sampling and random walk algorithms.
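To make the idea of a generic, bias-driven random walk concrete, here is a minimal sequential sketch in Python. It is not C-SAW's API or its GPU implementation: the names `select_neighbor`, `random_walk`, and the `bias(u, v)` callback are hypothetical, chosen only to illustrate how a user-supplied bias function can define different walk algorithms on top of one shared selection routine. The selection step uses prefix sums plus a binary search, which is one common way to pick an item with probability proportional to its weight.

```python
import bisect
import random
from itertools import accumulate
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]          # adjacency list: vertex -> neighbors
Bias = Callable[[int, int], float]    # hypothetical callback: weight of edge (u, v)


def select_neighbor(u: int, neighbors: List[int], bias: Bias,
                    rng: random.Random) -> Optional[int]:
    """Pick one neighbor of u with probability proportional to bias(u, v)."""
    if not neighbors:
        return None
    # Prefix sums of the (non-negative) edge weights.
    cum = list(accumulate(bias(u, v) for v in neighbors))
    total = cum[-1]
    if total <= 0:
        return None  # no selectable neighbor
    r = rng.random() * total
    # First index whose prefix sum exceeds r; clamp to the last valid index.
    idx = bisect.bisect_right(cum, r, 0, len(cum) - 1)
    return neighbors[idx]


def random_walk(graph: Graph, start: int, length: int, bias: Bias,
                seed: int = 0) -> List[int]:
    """Walk up to `length` steps from `start`, stopping early at dead ends."""
    rng = random.Random(seed)
    walk = [start]
    current = start
    for _ in range(length):
        nxt = select_neighbor(current, graph.get(current, []), bias, rng)
        if nxt is None:
            break
        walk.append(nxt)
        current = nxt
    return walk


if __name__ == "__main__":
    g = {0: [1, 2], 1: [2], 2: [0], 3: []}
    # Uniform bias gives a simple (unbiased) random walk.
    print(random_walk(g, start=0, length=5, bias=lambda u, v: 1.0))
```

Swapping the `bias` callback (for example, weighting by neighbor degree) changes the walk's behavior without touching the selection code, which is the kind of separation a generic sampling API aims for. A GPU version would parallelize the prefix sum and search across threads rather than run them in a loop.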
