Paper Title
Convex Analysis of the Mean Field Langevin Dynamics
Paper Authors
Paper Abstract
As an example of the nonlinear Fokker-Planck equation, the mean field Langevin dynamics has recently attracted attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime, and hence the convergence of the dynamics is of great theoretical interest. In this work, we give a concise and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous- and discrete-time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques in [Vempala and Wibisono (2019)], allows us to develop a simple convergence theory parallel to classical results in convex optimization. Furthermore, we reveal that $p_q$ connects to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the algorithm's convergence.
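To fix ideas, a standard formulation of the objects named in the abstract (a sketch following the common setup in the mean field Langevin literature; the notation $L$, $\lambda$, $q_t$ here is illustrative and not taken from the paper itself) can be written as:

```latex
% Entropy-regularized objective over probability distributions q:
%   F(q) = L(q) + \lambda \,\mathrm{Ent}(q),
% where L is a (convex) loss functional and Ent is the negative entropy.
%
% Mean field Langevin dynamics (law of X_t is q_t):
%   \mathrm{d}X_t = -\nabla \frac{\delta L}{\delta q}(q_t)(X_t)\,\mathrm{d}t
%                   + \sqrt{2\lambda}\,\mathrm{d}B_t,
% where \delta L / \delta q denotes the first variation of L.
%
% Proximal Gibbs distribution associated with a current iterate q:
%   p_q(x) \propto \exp\!\Big(-\tfrac{1}{\lambda}\,
%                   \frac{\delta L}{\delta q}(q)(x)\Big),
% which plays the role of a "target" that q_t chases; when q = p_q,
% the distribution is a stationary point of F.
```

Intuitively, the nonlinearity of the Fokker-Planck equation comes from the drift depending on the current law $q_t$ itself, and the proximal Gibbs distribution $p_q$ serves as the comparison point in the convergence analysis.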