Paper Title
Regularization via Structural Label Smoothing
Paper Authors
Paper Abstract
Regularization is an effective way to improve the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output-distribution regularization that prevents a neural network from overfitting by softening the ground-truth labels in the training data, in an attempt to penalize overconfident outputs. Existing approaches typically use cross-validation to impose this smoothing, which is uniform across all training data. We show that such uniform label smoothing imposes a quantifiable bias on the Bayes error rate of the training data: regions of the feature space with high class overlap and low marginal likelihood incur a lower bias, while regions with low overlap and high marginal likelihood incur a higher bias. These theoretical results motivate a simple objective function for data-dependent smoothing that mitigates the potential negative consequences of the operation while retaining its desirable properties as a regularizer. We call this approach Structural Label Smoothing (SLS). We implement SLS and empirically validate it on synthetic, Higgs, SVHN, CIFAR-10, and CIFAR-100 datasets. The results confirm our theoretical insights and demonstrate the effectiveness of the proposed method compared to traditional label smoothing.
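To make the smoothing operation concrete, below is a minimal NumPy sketch contrasting uniform label smoothing, which mixes each one-hot target with the uniform distribution as y_smooth = (1 - eps) * one_hot(y) + eps / K, with a per-example (data-dependent) variant in the spirit of SLS. The function names and the source of the per-example smoothing values eps_per_example are illustrative assumptions, not the paper's actual estimator.

import numpy as np

def uniform_label_smoothing(labels, num_classes, eps=0.1):
    # Standard label smoothing: every example gets the same eps,
    # mixing the one-hot target with the uniform distribution over classes.
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - eps) * one_hot + eps / num_classes

def structural_label_smoothing(labels, num_classes, eps_per_example):
    # Hypothetical data-dependent variant: eps varies per example.
    # In practice eps_per_example would come from an estimate of local
    # class overlap / marginal likelihood (the paper derives its own
    # estimator; this array is assumed here for illustration).
    one_hot = np.eye(num_classes)[labels]
    eps = np.asarray(eps_per_example)[:, None]  # broadcast eps over classes
    return (1.0 - eps) * one_hot + eps / num_classes

# Usage: four examples, three classes; larger eps where classes overlap more.
labels = np.array([0, 2, 1, 0])
print(uniform_label_smoothing(labels, 3, eps=0.1))
print(structural_label_smoothing(labels, 3, [0.05, 0.2, 0.1, 0.0]))

The smoothed targets then replace the one-hot labels in a standard cross-entropy loss; only the per-example eps distinguishes the data-dependent variant from the uniform one.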