Paper Title
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
Paper Authors
Paper Abstract
Well-designed diagnostic tasks have played a key role in studying the failure of neural nets (NNs) to generalize systematically. Famous examples include SCAN and Compositional Table Lookup (CTL). Here we introduce CTL++, a new diagnostic dataset based on compositions of unary symbolic functions. While the original CTL is used to test length generalization or productivity, CTL++ is designed to test systematicity of NNs, that is, their capability to generalize to unseen compositions of known functions. CTL++ splits functions into groups and tests performance on group elements composed in a way not seen during training. We show that recent CTL-solving Transformer variants fail on CTL++. The simplicity of the task design allows for fine-grained control of task difficulty, as well as many insightful analyses. For example, we measure how much overlap between groups is needed by tested NNs for learning to compose. We also visualize how learned symbol representations in outputs of functions from different groups are compatible in case of success but not in case of failure. These results provide insights into failure cases reported on more complex compositions in the natural language domain. Our code is public.
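The group-based split described above can be sketched in a few lines. The following is a minimal illustration, not the authors' released code: it assumes each unary function is a random bijection over a small symbol alphabet (the alphabet size, group sizes, and composition depth of two are all hypothetical choices), and it builds a systematicity split in which same-group compositions are used for training while cross-group compositions are held out for testing.

```python
import itertools
import random

# Hypothetical alphabet of 8 symbols; the real CTL++ parameters may differ.
ALPHABET = list(range(8))

def random_unary_fn(rng):
    """Sample a unary symbolic function as a random bijection over ALPHABET."""
    perm = ALPHABET[:]
    rng.shuffle(perm)
    return dict(zip(ALPHABET, perm))

rng = random.Random(0)
group_a = [random_unary_fn(rng) for _ in range(3)]  # function group A
group_b = [random_unary_fn(rng) for _ in range(3)]  # function group B

def apply_chain(fns, x):
    """Apply a composition of unary functions to an input symbol."""
    for f in fns:
        x = f[x]
    return x

# Systematicity split: train on same-group compositions (A∘A and B∘B),
# test on cross-group compositions (A∘B) never seen during training.
train_pairs = (list(itertools.product(group_a, group_a))
               + list(itertools.product(group_b, group_b)))
test_pairs = list(itertools.product(group_a, group_b))

# A sample maps an input symbol to the output of the composed functions.
sample = (0, apply_chain(test_pairs[0], 0))
```

Every function is seen individually during training; only the cross-group *compositions* are novel at test time, which is exactly the capability the abstract calls systematicity.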