Paper Title
Frosting Weights for Better Continual Training
Paper Authors
Paper Abstract
Training a neural network model can be a lifelong and computationally intensive learning process. A severe adverse effect in deep neural network models is catastrophic forgetting: when retrained on new data, they can lose what they learned from earlier data. One appealing property for avoiding such disruption in continual learning is the additive nature of ensemble models. In this paper, we propose two generic ensemble approaches, gradient boosting and meta-learning, to address catastrophic forgetting when fine-tuning pre-trained neural network models.
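The following is a minimal sketch of the additive-ensemble idea the abstract appeals to, not the authors' exact method: the pre-trained network is frozen so its weights can never be overwritten, and a small "booster" network is trained on new data in a gradient-boosting style, with the ensemble output being the sum of the two. The `AdditiveEnsemble` class, the network sizes, and the training setup are all hypothetical choices for illustration.

```python
# Sketch (assumed, not from the paper): additive ensemble for continual training.
import torch
import torch.nn as nn

class AdditiveEnsemble(nn.Module):
    def __init__(self, pretrained: nn.Module, booster: nn.Module):
        super().__init__()
        self.pretrained = pretrained
        self.booster = booster
        # Freeze the pre-trained weights; only the booster is trainable,
        # so the original model cannot catastrophically forget.
        for p in self.pretrained.parameters():
            p.requires_grad = False

    def forward(self, x):
        # Additive combination: old model's logits plus the booster's correction.
        return self.pretrained(x) + self.booster(x)

# Hypothetical architectures and dimensions, for illustration only.
pretrained = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
booster = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
model = AdditiveEnsemble(pretrained, booster)

# Optimize only the booster's parameters on the new data distribution.
optimizer = torch.optim.Adam(model.booster.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One boosting-style update on a batch of new data: the booster learns
# to correct the frozen model's errors, while old knowledge stays intact.
x_new = torch.randn(8, 32)
y_new = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x_new), y_new)
loss.backward()
optimizer.step()
```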