Paper Title

Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling

Paper Authors

Munkhdalai, Tsendsuren

Paper Abstract

Training a deep neural network requires a large amount of single-task data and involves a long, time-consuming optimization phase. This is not scalable to complex, realistic environments with new, unexpected changes. Humans can perform fast incremental learning on the fly, and memory systems in the brain play a critical role. We introduce Sparse Meta Networks -- a meta-learning approach to learn online sequential adaptation algorithms for deep neural networks, by using deep neural networks. We augment a deep neural network with a layer-specific fast-weight memory. The fast-weights are generated sparsely at each time step and accumulated incrementally through time, providing a useful inductive bias for online continual adaptation. We demonstrate strong performance on a variety of sequential adaptation scenarios, from simple online reinforcement learning to large-scale adaptive language modelling.
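To make the mechanism described in the abstract concrete, below is a minimal sketch of a fast-weight-augmented layer: the slow weights are ordinary learned parameters, while rank-1 fast-weight updates are written sparsely (gated) at each time step and accumulated with decay through time. This is an illustration under assumptions, not the paper's exact architecture; the class name FastWeightLayer, the sigmoid gate, the 0.5 write threshold, and the decay factor are all hypothetical choices.

```python
# Minimal sketch, assuming PyTorch; illustrates the idea only, not the
# paper's exact formulation. The gate, threshold, and decay are hypothetical.
import torch
import torch.nn as nn


class FastWeightLayer(nn.Module):
    def __init__(self, dim, decay=0.95):
        super().__init__()
        self.slow = nn.Linear(dim, dim)   # slow, meta-learned weights
        self.gate = nn.Linear(dim, 1)     # decides whether to write this step
        self.key = nn.Linear(dim, dim)    # produces the fast-weight update
        self.decay = decay                # how quickly old fast-weights fade

    def forward(self, x_seq):
        """x_seq: (T, dim) -- one sequence, processed step by step."""
        dim = x_seq.size(-1)
        fast = x_seq.new_zeros(dim, dim)  # fast-weight memory, starts empty
        outputs = []
        for x in x_seq:
            h = self.slow(x) + fast @ x   # slow weights plus fast-weight memory
            outputs.append(torch.tanh(h))
            write = torch.sigmoid(self.gate(x))        # write strength in (0, 1)
            if write.item() > 0.5:                     # sparse: skip most steps
                update = torch.outer(self.key(x), x)   # rank-1 outer-product update
                fast = self.decay * fast + write * update
        return torch.stack(outputs)


# Example: adapt online over a 20-step sequence of 16-dim inputs.
layer = FastWeightLayer(dim=16)
y = layer(torch.randn(20, 16))  # -> (20, 16)
```

Accumulating only occasional rank-1 writes keeps per-step adaptation cheap and gives the memory the incremental, through-time character the abstract attributes to its inductive bias.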
