Paper Title

Understanding Robust Generalization in Learning Regular Languages

Authors

Soham Dan, Osbert Bastani, Dan Roth

Abstract

A key feature of human intelligence is the ability to generalize beyond the training distribution, for instance, parsing longer sentences than seen in the past. Currently, deep neural networks struggle to generalize robustly to such shifts in the data distribution. We study robust generalization in the context of using recurrent neural networks (RNNs) to learn regular languages. We hypothesize that standard end-to-end modeling strategies cannot generalize well to systematic distribution shifts and propose a compositional strategy to address this. We compare an end-to-end strategy that maps strings to labels with a compositional strategy that predicts the structure of the deterministic finite-state automaton (DFA) that accepts the regular language. We theoretically prove that the compositional strategy generalizes significantly better than the end-to-end strategy. In our experiments, we implement the compositional strategy via an auxiliary task where the goal is to predict the intermediate states visited by the DFA when parsing a string. Our empirical results support our hypothesis, showing that auxiliary tasks can enable robust generalization. Interestingly, the end-to-end RNN generalizes significantly better than the theoretical lower bound, suggesting that it is able to achieve at least some degree of robust generalization.
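To make the auxiliary task concrete, here is a minimal illustrative sketch (not code from the paper): a hypothetical two-state DFA over {0, 1} that accepts strings containing an even number of 1s. Simulating the DFA yields the sequence of intermediate states visited while parsing, which is the kind of state trace the compositional strategy asks the model to predict; the final state determines the acceptance label.

```python
def dfa_state_trace(string, transitions, start):
    """Return the list of states a DFA visits while reading `string`."""
    state = start
    trace = [state]
    for symbol in string:
        state = transitions[(state, symbol)]
        trace.append(state)
    return trace

# Hypothetical example DFA: state "even" is the accepting state.
transitions = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

trace = dfa_state_trace("1011", transitions, "even")
# `trace` would serve as the auxiliary supervision signal;
# the final state gives the end-to-end label.
accepted = trace[-1] == "even"
```

Under this framing, the end-to-end strategy supervises only `accepted`, while the compositional strategy additionally supervises the full `trace`.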
