Paper Title
Large Neural Networks Learning from Scratch with Very Few Data and without Explicit Regularization
Paper Authors
Paper Abstract
Recent findings have shown that highly over-parameterized Neural Networks generalize without pretraining or explicit regularization. It is achieved with zero training error, i.e., complete over-fitting by memorizing the training data. This is surprising, since it is completely against traditional machine learning wisdom. In our empirical study we fortify these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization or pretraining. We train the architectures ResNet018, ResNet101 and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102 and StanfordCars with 100 classes and more, perform a comprehensive comparative study and draw implications for the practical application of CNNs. Finally, we show that a randomly initialized VGG19 with 140 million weights learns to distinguish airplanes and motorbikes with up to 95% accuracy using only 20 training samples per class.
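The smallest experiment the abstract describes is simple enough to sketch. Below is a minimal PyTorch sketch, not the authors' code, of that setting: a randomly initialized VGG19 trained from scratch on 20 samples per class for a two-class task (airplanes vs. motorbikes), with no pretraining, no image augmentation, and no explicit regularization. The folder layout under data/train, the learning rate, batch size, and epoch cap are all assumptions for illustration.

# Minimal sketch (assumptions noted below), not the authors' implementation.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, models, transforms

# Deterministic resize only -- deliberately no augmentation.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical layout: data/train/<class_name>/*.jpg with two classes,
# e.g. airplanes and motorbikes as in the final experiment of the abstract.
train_full = datasets.ImageFolder("data/train", transform=preprocess)

# Keep only 20 training samples per class, the abstract's smallest setting.
per_class, counts, keep = 20, {}, []
for idx, (_, label) in enumerate(train_full.samples):
    if counts.get(label, 0) < per_class:
        counts[label] = counts.get(label, 0) + 1
        keep.append(idx)
train_loader = DataLoader(Subset(train_full, keep), batch_size=8, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
# Random initialization (weights=None), ~140M parameters for VGG19;
# dropout=0.0 disables the classifier's default dropout (torchvision >= 0.13),
# so no explicit regularization remains in the model.
model = models.vgg19(weights=None, num_classes=2, dropout=0.0).to(device)

# weight_decay=0.0: no explicit regularization; lr and momentum are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=0.0)
criterion = nn.CrossEntropyLoss()

# Train until the tiny training set is memorized, i.e. zero training error.
for epoch in range(200):
    model.train()
    correct = total = 0
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        correct += (model(x).argmax(1) == y).sum().item()
        total += y.numel()
    if correct == total:  # complete over-fit on the training data
        break

Generalization would then be measured on held-out samples from the same two classes; the abstract reports up to 95% test accuracy in this regime.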