Paper Title
An Efficient Framework for Clustered Federated Learning
Paper Authors
Paper Abstract
We address the problem of federated learning (FL) where users are distributed and partitioned into clusters. This setup captures settings where different groups of users have their own objectives (learning tasks) but by aggregating their data with others in the same cluster (same learning task), they can leverage the strength in numbers in order to perform more efficient federated learning. For this new framework of clustered federated learning, we propose the Iterative Federated Clustering Algorithm (IFCA), which alternately estimates the cluster identities of the users and optimizes model parameters for the user clusters via gradient descent. We analyze the convergence rate of this algorithm first in a linear model with squared loss and then for generic strongly convex and smooth loss functions. We show that in both settings, with good initialization, IFCA is guaranteed to converge, and discuss the optimality of the statistical error rate. In particular, for the linear model with two clusters, we can guarantee that our algorithm converges as long as the initialization is slightly better than random. When the clustering structure is ambiguous, we propose to train the models by combining IFCA with the weight sharing technique in multi-task learning. In the experiments, we show that our algorithm can succeed even if we relax the requirements on initialization with random initialization and multiple restarts. We also present experimental results showing that our algorithm is efficient in non-convex problems such as neural networks. We demonstrate the benefits of IFCA over the baselines on several clustered FL benchmarks.
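The abstract describes IFCA as alternating between estimating each user's cluster identity and updating the cluster models by gradient descent. The following is a minimal sketch of that alternation for the linear-model-with-squared-loss setting analyzed in the paper; it is an illustration under assumptions, not the authors' implementation, and all names and hyperparameters (e.g., ifca_round, lr, the synthetic two-cluster data) are hypothetical.

```python
# Minimal sketch of the IFCA alternation described in the abstract:
# each user picks the cluster model with the lowest local loss, then the
# server averages gradients per cluster and takes a gradient step.
# Illustrative only; names and hyperparameters are assumptions.
import numpy as np

def local_loss(theta, X, y):
    """Squared loss of a linear model on one user's data."""
    r = X @ theta - y
    return 0.5 * np.mean(r ** 2)

def local_grad(theta, X, y):
    """Gradient of the squared loss on one user's data."""
    return X.T @ (X @ theta - y) / len(y)

def ifca_round(thetas, user_data, lr=0.1):
    """One IFCA round: estimate cluster identities, then update each cluster model."""
    k = len(thetas)
    grads = [np.zeros_like(thetas[0]) for _ in range(k)]
    counts = [0] * k
    for X, y in user_data:
        # Cluster identity estimation: pick the model with the lowest local loss.
        j = int(np.argmin([local_loss(theta, X, y) for theta in thetas]))
        grads[j] += local_grad(thetas[j], X, y)
        counts[j] += 1
    # Gradient-descent update of each cluster model with its averaged gradient.
    return [theta - lr * g / max(c, 1)
            for theta, g, c in zip(thetas, grads, counts)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_users, n_samples = 5, 20, 50
    true = [rng.normal(size=d), -rng.normal(size=d)]  # two ground-truth cluster models
    user_data = []
    for i in range(n_users):
        X = rng.normal(size=(n_samples, d))
        y = X @ true[i % 2] + 0.1 * rng.normal(size=n_samples)
        user_data.append((X, y))
    # Initialization: the abstract notes random initialization with multiple restarts works in practice.
    thetas = [rng.normal(size=d), rng.normal(size=d)]
    for _ in range(50):
        thetas = ifca_round(thetas, user_data)
```

The key design choice mirrored here is that cluster membership is never communicated directly: each user re-estimates it every round by comparing the current cluster models on its own data, which is what makes the procedure compatible with the federated setting.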