关于分布式可分离计算的计算和通信成本之间的权衡

论文标题

关于分布式可分离计算的计算和通信成本之间的权衡

On the Tradeoff Between Computation and Communication Costs for Distributed Linearly Separable Computation

论文作者

Wan, Kai, Sun, Hua, Ji, Mingyue, Caire, Giuseppe

论文摘要

本文研究了分布式线性可分离的计算问题，这是对许多现有的分布式计算问题（例如分布式梯度下降和分布式线性变换）的概括。在此问题中，主人要求$ n $分布式工人计算$ k $数据集的线性可分离函数，该功能是$ k $消息的一组$ k_c $线性组合（每个消息都是一个数据集的函数）。我们为每个工人分配了一些数据集，然后计算相应的消息并返回这些消息的某些函数，从而从$ n $工人的任何$ n_r $的答案中，主人可以恢复任务功能。在文献中，考虑了$ k_c = 1 $或计算成本最低的具体情况。在本文中，我们专注于一般案例（即一般$ k_c $和一般计算成本），并旨在找到最低沟通成本。我们首先提出了一个新颖的匡威束缚，对流行的环状分配的约束（文献中广泛考虑）的沟通成本限制，该数据将数据集分配给工人以循环方式分配给工人。由于观察到的分布式计算的现有策略缺乏达到相反界的策略，我们为某些系统参数提出了一种新颖的分布计算方案。当$ k_c $较大时，提出的计算方案对于任何分配都是最佳的，并且当工人和数据集的数量相等或$ k_c $时，在循环分配下是最佳的。此外，在循环分配下，对于其余情况下，它是最佳的订单。

This paper studies the distributed linearly separable computation problem, which is a generalization of many existing distributed computing problems such as distributed gradient descent and distributed linear transform. In this problem, a master asks $N$ distributed workers to compute a linearly separable function of $K$ datasets, which is a set of $K_c$ linear combinations of $K$ messages (each message is a function of one dataset). We assign some datasets to each worker, which then computes the corresponding messages and returns some function of these messages, such that from the answers of any $N_r$ out of $N$ workers the master can recover the task function. In the literature, the specific case where $K_c = 1$ or where the computation cost is minimum has been considered. In this paper, we focus on the general case (i.e., general $K_c$ and general computation cost) and aim to find the minimum communication cost. We first propose a novel converse bound on the communication cost under the constraint of the popular cyclic assignment (widely considered in the literature), which assigns the datasets to the workers in a cyclic way. Motivated by the observation that existing strategies for distributed computing fall short of achieving the converse bound, we propose a novel distributed computing scheme for some system parameters. The proposed computing scheme is optimal for any assignment when $K_c$ is large and is optimal under cyclic assignment when the numbers of workers and datasets are equal or $K_c$ is small. In addition, it is order optimal within a factor of 2 under cyclic assignment for the remaining cases.

下载PDF全文

下载文献需遵守相关版权规定

论文标题