Paper Title
batchboost: regularization for stabilizing training with resistance to underfitting & overfitting
Paper Authors
Paper Abstract
Overfitting & underfitting and stable training are an important challenges in machine learning. Current approaches for these issues are mixup, SamplePairing and BC learning. In our work, we state the hypothesis that mixing many images together can be more effective than just two. Batchboost pipeline has three stages: (a) pairing: method of selecting two samples. (b) mixing: how to create a new one from two samples. (c) feeding: combining mixed samples with new ones from dataset into batch (with ratio $γ$). Note that sample that appears in our batch propagates with subsequent iterations with less and less importance until the end of training. Pairing stage calculates the error per sample, sorts the samples and pairs with strategy: hardest with easiest one, than mixing stage merges two samples using mixup, $x_1 + (1-λ)x_2$. Finally, feeding stage combines new samples with mixed by ratio 1:1. Batchboost has 0.5-3% better accuracy than the current state-of-the-art mixup regularization on CIFAR-10 & Fashion-MNIST. Our method is slightly better than SamplePairing technique on small datasets (up to 5%). Batchboost provides stable training on not tuned parameters (like weight decay), thus its a good method to test performance of different architectures. Source code is at: https://github.com/maciejczyzewski/batchboost