Paper Title
The impact of memory on learning sequence-to-sequence tasks
Paper Authors
Paper Abstract
The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not yet been studied from this perspective. Here, we propose a simple model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences -- the stochastic switching-Ornstein-Uhlenbeck (SSOU) model. We introduce a measure of non-Markovianity to quantify the amount of memory in the sequences. For a minimal auto-regressive (AR) learning model trained on this task, we identify two learning regimes corresponding to distinct phases in the stationary state of the SSOU process. These phases emerge from the interplay between two different time scales that govern the sequence statistics. Moreover, we observe that while increasing the integration window of the AR model always improves performance, albeit with diminishing returns, increasing the non-Markovianity of the input sequences can improve or degrade its performance. Finally, we perform experiments with recurrent and convolutional neural networks that show that our observations carry over to more complicated neural network architectures.
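To make the generative process concrete, below is a minimal simulation sketch of a switching Ornstein-Uhlenbeck process of the kind the abstract describes. It assumes the OU mean switches between two values at a Poisson rate; the parameter names (tau, D, switch_rate, mu) and the two-state telegraph structure are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def simulate_ssou(n_steps=10_000, dt=0.01, tau=1.0, D=1.0,
                  switch_rate=0.5, mu=1.0, rng=None):
    """Sketch of a stochastic switching Ornstein-Uhlenbeck (SSOU) trajectory.

    The observed variable x relaxes toward a mean s(t) * mu on time scale
    tau, while the hidden state s(t) in {-1, +1} flips at Poisson rate
    switch_rate. The interplay of tau and 1/switch_rate is the kind of
    two-time-scale competition the abstract refers to. Assumed structure,
    not the paper's exact model.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = np.empty(n_steps)
    s = np.empty(n_steps)
    x[0], s[0] = 0.0, 1.0
    for t in range(1, n_steps):
        # Telegraph process: flip the hidden state with probability rate * dt.
        s[t] = -s[t - 1] if rng.random() < switch_rate * dt else s[t - 1]
        # Euler-Maruyama step for the OU dynamics around the current mean.
        drift = -(x[t - 1] - mu * s[t]) / tau
        x[t] = x[t - 1] + drift * dt + np.sqrt(2 * D * dt) * rng.normal()
    return x, s  # observed input sequence and hidden switching signal
```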
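As a stand-in for the minimal auto-regressive learner with an integration window, the following sketch fits a linear least-squares readout from a sliding window of past inputs to the hidden switching signal. The windowed linear form and the `window` parameter are assumptions chosen to illustrate the "integration window" idea, not the paper's exact training setup.

```python
def fit_ar_readout(x, s, window=20):
    """Least-squares readout from a sliding window of past inputs.

    The prediction for s[t] is a linear function of x[t-window+1 .. t];
    a hypothetical minimal AR model, not the paper's exact learner.
    """
    X = np.stack([x[t - window + 1:t + 1] for t in range(window - 1, len(x))])
    y = s[window - 1:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Usage: per the abstract, larger windows should improve the fit,
# though with diminishing returns.
x, s = simulate_ssou()
w = fit_ar_readout(x, s, window=20)
```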