Paper Title

Variance-Reduced Methods for Machine Learning

Paper Authors

Gower, Robert M., Schmidt, Mark, Bach, Francis, Richtarik, Peter

Paper Abstract

Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last 8 years have seen an exciting new development: variance reduction (VR) for stochastic optimization methods. These VR methods excel in settings where more than one pass through the training data is allowed, achieving a faster convergence than SGD in theory as well as practice. These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at non-expert readers. We focus mainly on the convex setting, and leave pointers to readers interested in extensions for minimizing non-convex functions.
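
As an illustration of the variance-reduction idea (a sketch for orientation, not code from the review itself), below is a minimal Python implementation of SVRG, one of the canonical VR methods for finite-sum objectives of the form f(w) = (1/n) Σᵢ fᵢ(w). The function names, step size, and least-squares example are assumptions made for illustration.

```python
import numpy as np

def svrg(grad_i, w0, n, step, n_epochs, m, seed=0):
    """Minimal SVRG sketch for min_w (1/n) * sum_i f_i(w).

    grad_i(w, i) should return the gradient of f_i at w.
    """
    w = w0.astype(float).copy()
    rng = np.random.default_rng(seed)
    for _ in range(n_epochs):
        # Snapshot point: compute the full gradient once per epoch.
        w_snap = w.copy()
        full_grad = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
        # Inner loop of cheap, variance-reduced stochastic steps.
        for _ in range(m):
            i = rng.integers(n)
            # Control-variate estimate: unbiased for the full gradient,
            # with variance shrinking to zero as w approaches w_snap.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= step * g
    return w

# Hypothetical usage: least squares, f_i(w) = 0.5 * (a_i @ w - b_i)**2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
grad_i = lambda w, i: (A[i] @ w - b[i]) * A[i]
w_hat = svrg(grad_i, np.zeros(5), n=100, step=0.01, n_epochs=30, m=100)
```

The corrected gradient estimate stays unbiased while its variance vanishes as the iterates approach the snapshot point; this is what lets VR methods converge faster than plain SGD on finite data sets where multiple passes are possible.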
