Mitigating Data Heterogeneity in Federated Learning with Data Augmentation

Paper Title

Mitigating Data Heterogeneity in Federated Learning with Data Augmentation

Authors

Artur Back de Luca, Guojun Zhang, Xi Chen, Yaoliang Yu

Abstract

Federated Learning (FL) is a prominent framework that enables training a centralized model while securing user privacy by fusing local, decentralized models. In this setting, one major obstacle is data heterogeneity, i.e., each client having non-identically and independently distributed (non-IID) data. This is analogous to the context of Domain Generalization (DG), where each client can be treated as a different domain. However, while many approaches in DG tackle data heterogeneity from the algorithmic perspective, recent evidence suggests that data augmentation can induce equal or greater performance. Motivated by this connection, we present federated versions of popular DG algorithms, and show that by applying appropriate data augmentation, we can mitigate data heterogeneity in the federated setting, and obtain higher accuracy on unseen clients. Equipped with data augmentation, we can achieve state-of-the-art performance using even the most basic Federated Averaging algorithm, with much sparser communication.
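
To make the combination described in the abstract concrete, here is a minimal sketch of Federated Averaging with augmentation applied during local training. It is an illustration under stated assumptions, not the authors' implementation: the 1-D linear model, the Gaussian-jitter `augment` function, and the names `local_update`, `fed_avg`, and `run_rounds` are all hypothetical. The abstract does not say which augmentations are used.

```python
import random

def augment(x, rng, noise=0.1):
    # Hypothetical augmentation: jitter a scalar feature with Gaussian noise.
    return x + rng.gauss(0.0, noise)

def local_update(weights, data, rng, lr=0.1, epochs=5):
    # Several epochs of SGD on y = w*x + b using augmented inputs.
    # More local epochs per round means fewer communication rounds.
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            xa = augment(x, rng)
            err = (w * xa + b) - y
            w -= lr * err * xa
            b -= lr * err
    return [w, b]

def fed_avg(client_weights, client_sizes):
    # Average client parameters, weighted by local dataset size.
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def run_rounds(clients, rounds=20, seed=0):
    # Each round: broadcast the global model, train locally, aggregate.
    rng = random.Random(seed)
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(list(global_w), data, rng) for data in clients]
        global_w = fed_avg(updates, [len(d) for d in clients])
    return global_w

if __name__ == "__main__":
    # Two clients with non-IID inputs (disjoint x ranges), same target y = 2x + 1.
    client_a = [(x / 10, 2 * (x / 10) + 1) for x in range(0, 10)]
    client_b = [(1 + x / 10, 2 * (1 + x / 10) + 1) for x in range(0, 10)]
    print(run_rounds([client_a, client_b]))
```

In this sketch only raw data stays on each client; the server sees parameter vectors alone. Raising `epochs` while lowering `rounds` is one way to trade local computation for sparser communication.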
