Paper title

Driving asynchronous distributed tasks with events

Authors

Nick Brown, Oliver Thomson Brown, J. Mark Bull

Abstract

Open-source matters, not just to the current cohort of HPC users but also to potential new HPC communities, such as machine learning, themselves often rooted in open-source. Many of these potential new workloads are, by their very nature, far more asynchronous and unpredictable than traditional HPC codes and open-source solutions must be found to enable new communities of developers to easily take advantage of large scale parallel machines. Task-based models have the potential to help here, but many of these either entirely abstract the user from the distributed nature of their code, placing emphasis on the runtime to make important decisions concerning scheduling and locality, or require the programmer to explicitly combine their task-based code with a distributed memory technology such as MPI, which adds considerable complexity. In this paper we describe a new approach where the programmer still splits their code up into distinct tasks, but is explicitly aware of the distributed nature of the machine and drives interactions between tasks via events. This provides the best of both worlds; the programmer is able to direct important aspects of parallelism whilst still being abstracted from the low level mechanism of how this parallelism is achieved. We demonstrate our approach via two use-cases, the Graph500 BFS benchmark and in-situ data analytics of MONC, an atmospheric model. For both applications we demonstrate considerably improved performance at large core counts and the result of this work is an approach and open-source library which is readily applicable to a wide range of codes.
