Paper Title

Variance-Reduced Methods for Machine Learning

Paper Authors

Gower, Robert M., Schmidt, Mark, Bach, Francis, Richtarik, Peter

Paper Abstract

Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last 8 years have seen an exciting new development: variance reduction (VR) for stochastic optimization methods. These VR methods excel in settings where more than one pass through the training data is allowed, achieving a faster convergence than SGD in theory as well as practice. These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at non-expert readers. We focus mainly on the convex setting, and leave pointers to readers interested in extensions for minimizing non-convex functions.
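
As an illustration of the variance-reduction idea (a sketch for orientation, not code from the review itself), below is a minimal Python implementation of SVRG, one of the canonical VR methods for finite-sum objectives of the form f(w) = (1/n) Σᵢ fᵢ(w). The function names, step size, and least-squares example are assumptions made for illustration.

```python
import numpy as np

def svrg(grad_i, w0, n, step, n_epochs, m, seed=0):
    """Minimal SVRG sketch for min_w (1/n) * sum_i f_i(w).

    grad_i(w, i) should return the gradient of f_i at w.
    """
    w = w0.astype(float).copy()
    rng = np.random.default_rng(seed)
    for _ in range(n_epochs):
        # Snapshot point: compute the full gradient once per epoch.
        w_snap = w.copy()
        full_grad = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
        # Inner loop of cheap, variance-reduced stochastic steps.
        for _ in range(m):
            i = rng.integers(n)
            # Control-variate estimate: unbiased for the full gradient,
            # with variance shrinking to zero as w approaches w_snap.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= step * g
    return w

# Hypothetical usage: least squares, f_i(w) = 0.5 * (a_i @ w - b_i)**2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
grad_i = lambda w, i: (A[i] @ w - b[i]) * A[i]
w_hat = svrg(grad_i, np.zeros(5), n=100, step=0.01, n_epochs=30, m=100)
```

The corrected gradient estimate stays unbiased while its variance vanishes as the iterates approach the snapshot point; this is what lets VR methods converge faster than plain SGD on finite data sets where multiple passes are possible.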
