Paper Title
Multitasking Framework for Unsupervised Simple Definition Generation
Paper Authors
Paper Abstract
The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner's dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.
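The Chinese result above is reported as the proportion of low-level (HSK level 1-3) words in the generated definitions. A minimal sketch of how such a proportion might be computed is shown below. It assumes the definition is already segmented into words and that a word-to-HSK-level lexicon is available. The function name `low_level_ratio`, the toy lexicon `toy_hsk` (whose levels are illustrative, not taken from an official HSK list), and the choice to count out-of-lexicon words as not low-level are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, Iterable


def low_level_ratio(
    tokens: Iterable[str],
    hsk_level: Dict[str, int],
    max_level: int = 3,
) -> float:
    """Return the fraction of tokens whose HSK level is <= max_level.

    Tokens missing from the lexicon are treated as NOT low-level
    (an assumption; the paper's handling may differ).
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0
    low = sum(
        1 for tok in tokens
        if hsk_level.get(tok, max_level + 1) <= max_level
    )
    return low / len(tokens)


# Hypothetical toy lexicon: levels are illustrative only.
toy_hsk = {"我": 1, "喜欢": 1, "学习": 1, "环境": 3, "抽象": 6}

# Hypothetical pre-segmented token lists standing in for definitions.
complex_def = ["抽象", "环境"]
simple_def = ["我", "喜欢", "学习", "环境"]

complex_ratio = low_level_ratio(complex_def, toy_hsk)
simple_ratio = low_level_ratio(simple_def, toy_hsk)
print(f"complex: {complex_ratio:.2f}, simple: {simple_ratio:.2f}")
```

Under these assumptions, a higher ratio indicates that a definition relies more heavily on vocabulary a beginning learner is expected to know, which is the direction of improvement the abstract describes.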