Paper Title


Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals

Paper Authors

Kate McCurdy, Sharon Goldwater, Adam Lopez

Abstract


Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form -(e)d to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class -- and thus fail on tasks like German number inflection, where infrequent suffixes like -s can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high variability, and two suffixes evince 'regular' behavior, appearing more often with phonologically atypical inputs. Encoder-decoder models do generalize the most frequently produced plural class, but do not show human-like variability or 'regular' extension of these other plural markers. We conclude that modern neural models may still struggle with minority-class generalization.
