Paper Title
Regularization in network optimization via trimmed stochastic gradient descent with noisy label
Paper Authors
Paper Abstract
Regularization is essential for avoiding over-fitting to the training data in network optimization, leading to better generalization of the trained networks. Label noise provides strong implicit regularization by replacing the ground-truth labels of training examples with uniform random labels. However, it can also cause undesirable, misleading gradients due to the large loss associated with incorrect labels. We propose a first-order optimization method (Label-Noised Trim-SGD) that combines label noise with example trimming in order to remove outliers based on their loss. The proposed algorithm is simple, yet it enables us to impose large label noise and obtain a better regularization effect than either of the original methods. A quantitative analysis is performed by comparing the behavior of label noise, example trimming, and the proposed algorithm. We also present empirical results that demonstrate the effectiveness of our algorithm on major benchmarks and fundamental networks, where our method successfully outperforms state-of-the-art optimization methods.
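The two ingredients described in the abstract can be illustrated with a minimal sketch: inject uniform random labels into a fraction of the training examples, and at each step trim away the examples with the largest per-example losses before averaging the gradient. This is only an illustrative NumPy toy on softmax regression, not the authors' implementation; the function names (`label_noise`, `trimmed_sgd_step`) and the trimming rule (keep the smallest-loss fraction of the batch) are assumptions for the sketch.

```python
import numpy as np

def label_noise(y, num_classes, noise_rate, rng):
    """Replace a fraction `noise_rate` of labels with uniform random labels
    (the implicit-regularization ingredient). Illustrative only."""
    y = y.copy()
    mask = rng.random(len(y)) < noise_rate
    y[mask] = rng.integers(0, num_classes, size=mask.sum())
    return y

def trimmed_sgd_step(w, X, y, lr, trim_frac):
    """One gradient step of softmax regression that discards the
    `trim_frac` highest-loss examples (the example-trimming ingredient)."""
    # Per-example softmax cross-entropy losses.
    logits = X @ w
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    losses = -np.log(p[np.arange(len(y)), y] + 1e-12)
    # Keep the lowest-loss examples; high-loss outliers (often the
    # noise-corrupted labels) are excluded from the gradient.
    keep = np.argsort(losses)[: int(len(y) * (1.0 - trim_frac))]
    Xk, yk, pk = X[keep], y[keep], p[keep]
    grad_logits = pk
    grad_logits[np.arange(len(yk)), yk] -= 1.0
    grad = Xk.T @ grad_logits / len(yk)
    return w - lr * grad, losses
```

Running several such steps on noisy labels lets the trimming suppress the misleading gradients that large label noise would otherwise introduce.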