Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

Paper title

Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

Authors

Hao Zheng, Mirella Lapata

Abstract

Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by learning specialized encodings for each decoding step. We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency, allowing us to tackle compositional generalization in a more realistic setting. Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically, at some interval. Our new architecture leads to better generalization performance across existing tasks and datasets, and a new machine translation benchmark which we create by detecting naturally occurring compositional patterns in relation to a training set. We show this methodology better emulates real-world requirements than artificial challenges.
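The key idea in the abstract, encoding values once while re-encoding keys only every few decoding steps, can be illustrated with a toy sketch. This is not the authors' implementation: the linear "encoders", the mean-pooled target prefix, and the names `encode_values`, `encode_keys`, `decode`, and `interval` are all assumptions made purely for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()


def encode_values(src_emb, w_v):
    """Hypothetical value encoder: depends on the source only, so it runs once."""
    return src_emb @ w_v


def encode_keys(src_emb, tgt_prefix_emb, w_k):
    """Hypothetical adaptive key encoder: conditions the source on the
    target prefix decoded so far (here, simply its mean embedding)."""
    if len(tgt_prefix_emb):
        context = tgt_prefix_emb.mean(axis=0)
    else:
        context = np.zeros(src_emb.shape[1])
    return (src_emb + context) @ w_k


def decode(src_emb, tgt_emb, w_k, w_v, interval):
    """Attend over the source at every step, re-encoding keys every `interval` steps.

    Returns the per-step attention outputs and the number of key re-encodings.
    """
    values = encode_values(src_emb, w_v)  # computed a single time
    keys, key_updates, outputs = None, 0, []
    for t in range(len(tgt_emb)):
        if t % interval == 0:  # periodic, not per-step, key re-encoding
            keys = encode_keys(src_emb, tgt_emb[:t], w_k)
            key_updates += 1
        query = tgt_emb[t]
        attn = softmax(keys @ query / np.sqrt(query.shape[0]))
        outputs.append(attn @ values)
    return np.stack(outputs), key_updates


rng = np.random.default_rng(0)
src = rng.normal(size=(5, 8))   # 5 source tokens, dimension 8
tgt = rng.normal(size=(6, 8))   # 6 decoding steps
w_k = rng.normal(size=(8, 8))
w_v = rng.normal(size=(8, 8))
outputs, key_updates = decode(src, tgt, w_k, w_v, interval=3)
print(outputs.shape, key_updates)
```

In this sketch, setting `interval=1` re-encodes the keys at every step, while a larger interval trades adaptivity for fewer encoder passes, which is the compute and memory saving the abstract describes.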
