Paper Title
From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
Paper Authors
Paper Abstract
Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework that connects optimization with generalization by analyzing the generalization error through the optimization trajectory of the gradient flow algorithm. The key ingredient of this framework is the uniform Łojasiewicz gradient inequality (Uniform-LGI), a property that is commonly satisfied when training machine learning models. Leveraging the Uniform-LGI, we first derive convergence rates for the gradient flow algorithm, and then give generalization bounds for a large class of machine learning models. We further apply our framework to three distinct machine learning models: linear regression, kernel regression, and two-layer neural networks. Through our approach, we obtain generalization estimates that match or extend previous results.
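For concreteness, the display below sketches the two objects the abstract refers to: the gradient flow dynamics and a Łojasiewicz-type gradient inequality. The specific form shown here, including the constant c and the exponent \mu, is an illustrative assumption; the paper's precise Uniform-LGI definition may differ.

\begin{align*}
  % Gradient flow on the training loss L, started from \theta_0:
  \dot{\theta}(t) &= -\nabla L(\theta(t)), \qquad \theta(0) = \theta_0, \\
  % A Lojasiewicz-type gradient inequality, assumed to hold uniformly
  % along the trajectory for some c > 0 and exponent \mu:
  \bigl\|\nabla L(\theta(t))\bigr\| &\ge c\,\bigl(L(\theta(t)) - L^{*}\bigr)^{\mu}.
\end{align*}

Combining the two yields \frac{d}{dt}\bigl(L(\theta(t)) - L^{*}\bigr) = -\|\nabla L(\theta(t))\|^{2} \le -c^{2}\bigl(L(\theta(t)) - L^{*}\bigr)^{2\mu}, which is the standard route from such an inequality to an explicit convergence rate: exponential decay when \mu = 1/2 (the Polyak-Łojasiewicz case) and a polynomial rate when \mu > 1/2.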