Personalized Federated Learning with Communication Compression

Paper title

Personalized Federated Learning with Communication Compression

Authors

El Houcine Bergou, Konstantin Burlachenko, Aritra Dutta, Peter Richtárik

Abstract

In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets held on resource-constrained, heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which may not be helpful to every device taking part in the training because the data are heterogeneous across devices. Recently, Hanzely and Richtárik (2020) proposed a new formulation for training personalized FL models, aimed at balancing the trade-off between the traditional global model and the local models that individual devices could train using only their private data. They derived a new algorithm, called Loopless Gradient Descent (L2GD), to solve it and showed that this algorithm leads to improved communication complexity guarantees in regimes where more personalization is required. In this paper, we equip their L2GD algorithm with a bidirectional compression mechanism to further reduce the communication bottleneck between the local devices and the server. Unlike other compression-based algorithms used in the FL setting, our compressed L2GD algorithm operates on a probabilistic communication protocol, in which communication does not happen on a fixed schedule. Moreover, our compressed L2GD algorithm maintains a convergence rate similar to that of vanilla SGD without compression. To empirically validate the efficiency of our algorithm, we perform diverse numerical experiments on both convex and non-convex problems, using various compression techniques.
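The two ideas the abstract combines, a probabilistic communication protocol and bidirectional compression, can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' algorithm: the function names rand_k and compressed_l2gd, the quadratic local losses, the random-k sparsifier, and the step-size scalings alpha / (n * (1 - p)) and alpha * lam / (n * p) are all illustrative choices that the abstract does not confirm. At each step a biased coin with probability p decides whether the devices communicate; when they do, each device compresses its model on the way up and the server compresses the average on the way down.

```python
import math
import random


def rand_k(vec, k, rng):
    """Unbiased random-k sparsifier: keep k random coordinates, rescale by d/k."""
    d = len(vec)
    keep = set(rng.sample(range(d), k))
    return [v * d / k if i in keep else 0.0 for i, v in enumerate(vec)]


def compressed_l2gd(targets, lam=1.0, p=0.2, alpha=0.05, k=2, steps=2000, seed=0):
    """Toy loop; device i holds the assumed loss f_i(x) = 0.5 * ||x - targets[i]||^2.

    lam is a hypothetical personalization penalty pulling local models
    toward their average; p is the probability of a communication step.
    """
    rng = random.Random(seed)
    n, d = len(targets), len(targets[0])
    x = [[0.0] * d for _ in range(n)]  # one personalized model per device
    comm_rounds = 0
    for _ in range(steps):
        if rng.random() < p:
            # Communication step, triggered by a coin flip, not a fixed schedule.
            comm_rounds += 1
            uplink = [rand_k(xi, k, rng) for xi in x]  # device -> server
            avg = [sum(u[j] for u in uplink) / n for j in range(d)]
            downlink = rand_k(avg, k, rng)  # server -> devices
            w = alpha * lam / (n * p)
            x = [[xi[j] - w * (ui[j] - downlink[j]) for j in range(d)]
                 for xi, ui in zip(x, uplink)]
        else:
            # Local step: each device follows the gradient of its own loss.
            w = alpha / (n * (1 - p))
            x = [[xi[j] - w * (xi[j] - t[j]) for j in range(d)]
                 for xi, t in zip(x, targets)]
    return x, comm_rounds


if __name__ == "__main__":
    toy_targets = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    models, rounds = compressed_l2gd(toy_targets)
    print(rounds, "communication steps out of 2000")
```

In this sketch a smaller p means fewer communication steps, and a smaller k means fewer coordinates sent per step, so the two knobs reduce communication in different ways. How the paper actually sets them is not stated in the abstract.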
