Paper Title
Regularization via Structural Label Smoothing
Paper Authors
Paper Abstract
Regularization is an effective way to improve the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output-distribution regularization that prevents a neural network from overfitting by softening the ground-truth labels in the training data, in an attempt to penalize overconfident outputs. Existing approaches typically use cross-validation to impose this smoothing, which is uniform across all training data. We show that such uniform label smoothing imposes a quantifiable bias on the Bayes error rate of the training data: regions of the feature space with high class overlap and low marginal likelihood incur a lower bias, while regions with low overlap and high marginal likelihood incur a higher bias. These theoretical results motivate a simple objective function for data-dependent smoothing that mitigates the potential negative consequences of the operation while retaining its desirable properties as a regularizer. We call this approach Structural Label Smoothing (SLS). We implement SLS and empirically validate it on synthetic, Higgs, SVHN, CIFAR-10, and CIFAR-100 datasets. The results confirm our theoretical insights and demonstrate the effectiveness of the proposed method compared to traditional label smoothing.
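To make the smoothing operation concrete, below is a minimal NumPy sketch contrasting uniform label smoothing, which mixes each one-hot target with the uniform distribution as y_smooth = (1 - eps) * one_hot(y) + eps / K, with a per-example (data-dependent) variant in the spirit of SLS. The function names and the source of the per-example smoothing values eps_per_example are illustrative assumptions, not the paper's actual estimator.

import numpy as np

def uniform_label_smoothing(labels, num_classes, eps=0.1):
    # Standard label smoothing: every example gets the same eps,
    # mixing the one-hot target with the uniform distribution over classes.
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - eps) * one_hot + eps / num_classes

def structural_label_smoothing(labels, num_classes, eps_per_example):
    # Hypothetical data-dependent variant: eps varies per example.
    # In practice eps_per_example would come from an estimate of local
    # class overlap / marginal likelihood (the paper derives its own
    # estimator; this array is assumed here for illustration).
    one_hot = np.eye(num_classes)[labels]
    eps = np.asarray(eps_per_example)[:, None]  # broadcast eps over classes
    return (1.0 - eps) * one_hot + eps / num_classes

# Usage: four examples, three classes; larger eps where classes overlap more.
labels = np.array([0, 2, 1, 0])
print(uniform_label_smoothing(labels, 3, eps=0.1))
print(structural_label_smoothing(labels, 3, [0.05, 0.2, 0.1, 0.0]))

The smoothed targets then replace the one-hot labels in a standard cross-entropy loss; only the per-example eps distinguishes the data-dependent variant from the uniform one.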