Paper Title
Communication-Efficient Distributed SGD with Error-Feedback, Revisited
Paper Authors
Paper Abstract
We show that the convergence proof of dist-EF-SGD, a recent algorithm of Zheng et al. (NeurIPS 2019) for communication-efficient distributed stochastic gradient descent with error-feedback, is mathematically problematic. Concretely, the original error bound for an arbitrary sequence of learning rates is unfortunately incorrect, which invalidates the upper bound in the convergence theorem for the algorithm. As evidence, we explicitly provide several counter-examples, for both convex and non-convex cases, showing that the error bound does not hold. We fix the issue by providing a new error bound and its corresponding proof, leading to a new convergence theorem for the dist-EF-SGD algorithm and thereby recovering its mathematical analysis.
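To make the object of the analysis concrete, below is a minimal single-worker sketch of SGD with gradient compression and error-feedback, in the spirit of the dist-EF-SGD update discussed in the abstract. It is an illustrative assumption, not the authors' exact distributed algorithm; the function names (`top_k`, `error_feedback_sgd`) and the quadratic test problem are ours.

```python
# Minimal sketch (assumed, not the paper's exact dist-EF-SGD): compressed SGD
# with error-feedback. The worker compresses (error memory + lr * gradient),
# applies only the compressed part, and keeps the residual for the next step.
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def error_feedback_sgd(grad_fn, x0, lr_schedule, steps, k):
    """Run compressed SGD with error-feedback on a single worker.

    grad_fn:     stochastic gradient oracle, grad_fn(x) -> gradient estimate
    lr_schedule: callable t -> learning rate eta_t (an arbitrary sequence,
                 the setting in which the original error bound is analyzed)
    """
    x = x0.astype(float).copy()
    e = np.zeros_like(x)            # error memory: residual dropped by compression
    for t in range(steps):
        eta = lr_schedule(t)
        p = e + eta * grad_fn(x)    # add back previously dropped mass
        delta = top_k(p, k)         # compressed update that is actually applied
        e = p - delta               # store what the compressor threw away
        x -= delta
    return x

# Usage: minimize f(x) = 0.5 * ||x||^2 with a noisy gradient oracle.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad_fn = lambda x: x + 0.01 * rng.standard_normal(x.shape)
    x_final = error_feedback_sgd(grad_fn, x0=np.ones(10),
                                 lr_schedule=lambda t: 0.1 / np.sqrt(t + 1),
                                 steps=500, k=3)
    print(np.linalg.norm(x_final))  # should be close to zero
```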