Paper Title

On the Error Resistance of Hinge Loss Minimization

Paper Authors

Talwar, Kunal

Abstract

Commonly used classification algorithms in machine learning, such as support vector machines, minimize a convex surrogate loss on training examples. In practice, these algorithms are surprisingly robust to errors in the training data. In this work, we identify a set of conditions on the data under which such surrogate loss minimization algorithms provably learn the correct classifier. This allows us to establish, in a unified framework, the robustness of these algorithms under various models on data as well as error. In particular, we show that if the data is linearly classifiable with a slightly non-trivial margin (i.e. a margin at least $C/\sqrt{d}$ for $d$-dimensional unit vectors), and the class-conditional distributions are near isotropic and logconcave, then surrogate loss minimization has negligible error on the uncorrupted data even when a constant fraction of examples are adversarially mislabeled.
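The robustness claim can be illustrated with a small numerical sketch (this is not the paper's construction): minimize the average hinge loss by subgradient descent on synthetic isotropic Gaussian data whose labels come from a unit-norm linear classifier, flip a constant fraction of the training labels at random (a weaker stand-in for the adversarial mislabeling in the abstract), and check that the learned halfspace still has small error on the uncorrupted labels. All names and parameter choices below are illustrative assumptions.

```python
import numpy as np

# Synthetic setup (illustrative assumption, not the paper's construction):
# isotropic Gaussian features, labels given by a unit-norm linear classifier.
rng = np.random.default_rng(0)
n, d = 2000, 20
w_true = np.ones(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.where(X @ w_true >= 0, 1.0, -1.0)

# Corrupt a constant fraction of the training labels.
# (Random flips stand in for the adversarial mislabeling in the abstract.)
flip = rng.random(n) < 0.10
y_noisy = np.where(flip, -y, y)

def minimize_hinge(X, y, steps=500, lr=0.5):
    """Subgradient descent on the average hinge loss max(0, 1 - y * <w, x>)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        active = y * (X @ w) < 1.0  # examples inside the margin
        g = -(X[active] * y[active, None]).sum(axis=0) / len(y)
        w -= lr * g
    return w

w_hat = minimize_hinge(X, y_noisy)

# Error is measured against the *uncorrupted* labels.
clean_acc = np.mean(np.where(X @ w_hat >= 0, 1.0, -1.0) == y)
cosine = w_hat @ w_true / (np.linalg.norm(w_hat) * np.linalg.norm(w_true))
print(f"clean accuracy: {clean_acc:.3f}, cosine with w_true: {cosine:.3f}")
```

Even with 10% of labels flipped, the minimizer's direction stays close to the true classifier, so its error on the clean labels remains small.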
