Paper Title
Cold Start Streaming Learning for Deep Networks
Paper Authors
Paper Abstract
The ability to dynamically adapt neural networks to newly-available data without performance deterioration would revolutionize deep learning applications. Streaming learning (i.e., learning from one data example at a time) has the potential to enable such real-time adaptation, but current approaches i) freeze a majority of network parameters during streaming and ii) depend upon offline, base initialization procedures over large subsets of data, which damages performance and limits applicability. To mitigate these shortcomings, we propose Cold Start Streaming Learning (CSSL), a simple, end-to-end approach for streaming learning with deep networks that uses a combination of replay and data augmentation to avoid catastrophic forgetting. Because CSSL updates all model parameters during streaming, the algorithm is capable of beginning streaming from a random initialization, making base initialization optional. Going further, the algorithm's simplicity allows theoretical convergence guarantees to be derived via analysis of the Neural Tangent Random Feature (NTRF). In experiments on the CIFAR100, ImageNet, and Core50 datasets, we find that CSSL outperforms existing baselines for streaming learning. Additionally, we propose a novel multi-task streaming learning setting and show that CSSL performs favorably in this domain. Put simply, CSSL performs well and demonstrates that the complicated, multi-step training pipelines adopted by most streaming methodologies can be replaced with a simple, end-to-end learning approach without sacrificing performance.
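To make the replay-plus-augmentation idea concrete, below is a minimal sketch of a cold-start streaming update loop in the spirit of CSSL: every model parameter is trained end to end, starting from a random initialization, with each new example learned alongside replayed, augmented examples. The `ReplayBuffer`, `augment`, and `stream_step` names, the reservoir-sampling buffer policy, the flip augmentation, and all hyperparameters are illustrative assumptions, not the paper's exact method or settings.

```python
# Illustrative sketch of a replay-based streaming update loop (assumptions
# noted above); not the authors' reference implementation.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplayBuffer:
    """Fixed-capacity buffer maintained with reservoir sampling over the stream."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []        # list of (example, label) pairs
        self.num_seen = 0

    def add(self, x, y):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            # Reservoir sampling: each seen example is retained with equal probability.
            idx = random.randrange(self.num_seen)
            if idx < self.capacity:
                self.data[idx] = (x, y)

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.tensor(ys)

def augment(x):
    # Placeholder augmentation (random horizontal flip); the paper combines
    # replay with data augmentation, and stronger transforms could be swapped in.
    if random.random() < 0.5:
        x = torch.flip(x, dims=[-1])
    return x

def stream_step(model, optimizer, buffer, x_new, y_new, replay_batch=32):
    """One streaming update: learn from a single new example plus replayed,
    augmented examples, updating *all* model parameters end to end."""
    buffer.add(x_new, int(y_new))
    xs, ys = buffer.sample(replay_batch)
    xs = torch.stack([augment(x) for x in xs])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(xs), ys)
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: begin streaming from a random initialization (no base initialization
# phase) and process labeled examples one at a time.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
buffer = ReplayBuffer(capacity=1000)
for x_new, y_new in [(torch.randn(3, 32, 32), 0), (torch.randn(3, 32, 32), 1)]:
    stream_step(model, optimizer, buffer, x_new, y_new)
```

Note that, unlike streaming methods that freeze a backbone after base initialization, every parameter here receives gradient updates at every step, which is what makes starting from scratch ("cold start") possible.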
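For context on the convergence analysis, the NTRF describes the network linearized around its random initialization. The sketch below gives the NTRF function class as commonly defined in this line of work (following Cao and Gu, 2019); the exact norm constraints and constants used in the paper's guarantees may differ.

```latex
% NTRF function class around the random initialization W^{(0)} of a width-m
% network f_W (sketched after Cao & Gu, 2019; the paper's exact formulation
% may differ).
\[
\mathcal{F}\big(\mathbf{W}^{(0)}, R\big) \;=\;
\Big\{\, f_{\mathbf{W}^{(0)}}(\cdot)
  + \big\langle \nabla_{\mathbf{W}} f_{\mathbf{W}^{(0)}}(\cdot),\,
    \mathbf{W} - \mathbf{W}^{(0)} \big\rangle
  \;:\; \big\|\mathbf{W} - \mathbf{W}^{(0)}\big\|_2 \le R/\sqrt{m} \,\Big\}
\]
```

Intuitively, convergence guarantees of this form bound the streaming algorithm's loss in terms of the best function in this linearized class, which is why the simplicity of the end-to-end update rule matters for the analysis.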